Preface



‘Der Kopf, so gesehen, hat mit dem Kopf, so gesehen, auch nicht die leiseste Ähnlichkeit
(...) Der Aspektwechsel. “Du würdest doch sagen, dass sich das Bild jetzt gänzlich
geändert hat!” Aber was ist anders: mein Eindruck? meine Stellungnahme? (...) Ich
beschreibe die Änderung wie eine Wahrnehmung, ganz, als hätte sich der Gegenstand vor
meinen Augen geändert.’ (Wittgenstein, Philosophische Untersuchungen II, §§127, 129).*

As the well-known picture above is meant to allegorize, some physical systems 
admit a dual description in either classical or quantum-mechanical terms. According 
to Bohr’s “doctrine of classical concepts”, measurement apparatuses are examples 
of such systems. More generally—as hammered down by decoherence theorists— 
the classical world around us is a case in point. As will be argued in this book, the 
measurement problem of quantum mechanics (highlighted by Schrödinger’s Cat) is
caused by this duality (rather than resolved by it, as Bohr is said to have thought). 

* ‘The head seen in this way hasn’t even the slightest similarity to the head seen in that way (...) 
The change of aspect. “But surely you’d say that the picture has changed altogether now!” But what
is different: my impression? my attitude? (...) I describe the change like a perception; just as if the
object has changed before my eyes.’ Translation: G.E.M. Anscombe, P.M.S. Hacker, & J. Schulte 
(Wittgenstein, 2009/1953, pp. 205-206). 


The aim of this book is to analyze the foundations of quantum theory from the 
point of view of classical-quantum duality, using the mathematical formalism of 
operator algebras on Hilbert space (and, more generally, C*-algebras) that was originally
created by von Neumann (followed by Gelfand and Naimark). In support of
this analysis, but also as a matter of independent interest, the book covers many of 
the traditional topics one might expect to find in a treatise on the foundations of 
quantum mechanics, like pure and mixed states, observables, the Born rule and its 
relation to both single-case probabilities and long-run frequencies, Gleason’s Theorem,
the theory of symmetry (including Wigner’s Theorem and its relatives, culminating
in a recent theorem of Hamhalter’s), Bell’s Theorem(s) and the like, quantization
theory, indistinguishable particles, large systems, spontaneous symmetry breaking,
the measurement problem, and (intuitionistic) quantum logic. One also finds
a few idiosyncratic themes, such as the Kadison-Singer Conjecture, topos theory 
(which naturally injects intuitionism into quantum logic), and an unusual emphasis 
on both conceptual and mathematical aspects of limits in physical theories. 

All of this is held together by what we call Bohrification, i.e., the mathematical 
interpretation of Bohr’s classical concepts by commutative C*-algebras, which in 
turn are studied in their quantum habitat of noncommutative C*-algebras. 

Thus the book is mostly written in mathematical physics style, but its real subject 
is natural philosophy. Hence its intended readership consists not only of mathematical
physicists, but also of philosophers of physics, as well as of theoretical physicists
who wish to do more than ‘shut up and calculate’, and finally of mathematicians who 
are interested in the mathematical and conceptual structure of quantum theory. 

To serve all these groups, the native mathematical language (i.e. of C*-algebras) 
is introduced slowly, starting with finite sets (as classical phase spaces) and
finite-dimensional Hilbert spaces. In addition, all advanced mathematical background that
is necessary but may distract from the main development is laid out in extensive 
appendices on Hilbert spaces, functional analysis, operator algebras, lattices and 
logic, and category theory and topos theory, so that the prerequisites for this book 
are limited to basic analysis and linear algebra (as well as some physics). These 
appendices not only provide a direct route to material that otherwise most readers 
would have needed to extract from thousands of pages of diverse textbooks, but they 
also contain some original material, and may be of interest even to mathematicians. 
In summary, the aims of this book are similar to those of its peerless paradigm: 

‘Der Gegenstand dieses Buches ist die einheitliche, und, soweit als möglich und angebracht,
mathematisch einwandfreie Darstellung der neuen Quantenmechanik (...). Dabei soll das
Hauptgewicht auf die allgemeinen und prinzipiellen Fragen, die im Zusammenhange mit
dieser Theorie entstanden sind, gelegt werden. Insbesondere sollen die schwierigen und
vielfach noch immer nicht restlos geklärten Interpretationsfragen näher untersucht werden.’
(von Neumann, Mathematische Grundlagen der Quantenmechanik, 1932, p. 1).^ 


^ ‘The object of this book is to present the new quantum mechanics in a unified presentation which, 
so far as it is possible and useful, is mathematically rigorous. (...) Therefore the principal emphasis 
shall be placed on the general and fundamental questions which have arisen in connection with this 
theory. In particular, the difficult problems with interpretation, many of which are even now not 
fully resolved, will be investigated in detail.’ Translation: R.T. Beyer (von Neumann, 1955, p. vii). 

Two other quotations the author often had in mind while writing this book are: 

‘And although the whole of philosophy is not immediately evident, still it is better to add 
something to our knowledge day by day than to fill up men’s minds in advance with the 
preconceptions of hypotheses.’ (Newton, draft preface to Principia, 1686).^ 

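‘Juist het feit dat een genie als DESCARTES volkomen naast de lijn van ontwikkeling is
blijven staan, die van GALILEI naar NEWTON voert (...) [is] een phase van den in de historie
zoo vaak herhaalden strijd tusschen de bescheidenheid der mathematisch-physische methode,
die na nauwkeurig onderzoek de verschijnselen der natuur in steeds meer omvattende
schemata met behulp van de exacte taal der mathesis wil beschrijven en den hoogmoed van
het philosophische denken, dat in een genialen greep de heele wereld wil omvatten (...).’
(Dijksterhuis, Val en Worp, 1924, p. 343).^

^ ‘Precisely the fact that a genius like Descartes remained entirely off the line of development
that leads from Galilei to Newton (...) [is] a phase of the struggle, so often repeated in history,
between the modesty of the mathematical-physical method, which after careful investigation
seeks to describe the phenomena of nature in ever more comprehensive schemes with the help
of the exact language of mathematics, and the hubris of philosophical thought, which wants to
encompass the whole world in a single stroke of genius (...).’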
Introduction


After 25 years of confusion and even occasional despair, in March 1926 physicists
suddenly had two theories of the microscopic world (Heisenberg, 1925; Schrödinger,
1926ab), which could hardly have looked more different. Heisenberg’s matrix mechanics
(as it came to be called a bit later) described experimentally measurable
quantities (i.e., “observables”) in terms of discrete quantum numbers, and apparently
lacked a state concept. Schrödinger’s wave mechanics focused on unobservable
continuous matter waves apparently playing the role of quantum states; at the
time the only observable within reach of his theory was the energy. Einstein is even
reported to have remarked in public that the two theories excluded each other.

Nonetheless, Pauli (in a letter to Jordan dated 12 April 1926), Schrödinger
(1926c) himself, Eckart (1926), and Dirac (1927) argued that in fact the two
theories were equivalent, although it is hard to speak of a complete argument even
at a heuristic level, let alone of a mathematical proof (Muller, 1997ab). A rigorous
equivalence proof was given by von Neumann (1927ab), who (at the age of 23) was the
first to unearth the mathematical structure of quantum mechanics as we still understand
it today. His effort, culminating in his monograph Mathematische Grundlagen
der Quantenmechanik (von Neumann, 1932), was based on the abstract concept of
a Hilbert space, which previously had only appeared in examples (i.e. specific
realizations) going back to the work of Hilbert and his school on integral equations.

The novelty of von Neumann’s abstract approach may be illustrated by the advice 
Hilbert’s former student Schmidt gave to von Neumann even at the end of the 1920s: 

‘Nein! Nein! Sagen Sie nicht Operator, sagen Sie Matrix!’ (Bernkopf, 1967, p. 346).^

Von Neumann proposed that observable quantities be interpreted as (possibly
unbounded) self-adjoint operators on some Hilbert space, whilst pure states are
realized as rays (i.e. unit vectors up to a phase) in the same space; finally, the inner
product provides the probabilities introduced by Born (1926ab). In particular,
Heisenberg’s observables were operators on $\ell^2(\mathbb{N})$, whereas Schrödinger’s
wave-functions were unit vectors in $L^2(\mathbb{R}^3)$. A unitary transformation between
these Hilbert spaces then provided the mathematical equivalence between their
competing theories.
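
The ingredients just listed can be written out in the simplest case, as a hedged sketch rather than von Neumann's general formulation. The symbols $a$, $\lambda_i$, $e_i$, $\psi$, $H$, and $U$ are notation chosen here for illustration, and the observable is assumed to have a non-degenerate discrete spectrum:

```latex
% Assumption: a is self-adjoint with an orthonormal eigenbasis (e_i) and distinct eigenvalues \lambda_i.
\[
  a e_i = \lambda_i e_i, \qquad \langle e_i, e_j \rangle = \delta_{ij}, \qquad
  P_\psi(a = \lambda_i) = |\langle e_i, \psi \rangle|^2 \quad (\|\psi\| = 1).
\]
% The probabilities depend only on the ray of \psi: replacing \psi by e^{i\theta}\psi leaves them unchanged.
% Any orthonormal basis (e_n)_{n \in \mathbb{N}} of a separable Hilbert space H defines a unitary map
\[
  U : H \to \ell^2(\mathbb{N}), \qquad (U\psi)_n = \langle e_n, \psi \rangle, \qquad
  \sum_{n} |(U\psi)_n|^2 = \|\psi\|^2 .
\]
```

The last identity is Parseval's relation; it is what makes the passage from wave-functions to coefficient sequences norm-preserving, and hence compatible with the probabilities in the first line.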

^ ‘No! No! You shouldn’t say operator, you should say matrix!’ 


This story is well known, but it is worth emphasizing (cf. Zalamea, 2016, §1.1)
that the most significant difference between von Neumann’s mathematical axiomatization
of quantum mechanics and Dirac’s heuristic but beautiful and systematic
treatment of the same theory (Dirac, 1930) was not so much the lack of mathematical
rigour in the latter. That point was indeed stressed by von Neumann (1932,
p. 2) himself, who was particularly annoyed with Dirac’s δ-function and his closely
related assumption that every self-adjoint operator can be diagonalized in the naive
way of having a basis of eigenvectors. The deeper difference was that Dirac’s approach
was relative to the choice of a (generalized) basis of a Hilbert space, whereas von
Neumann’s was absolute. In this sense, as a special case of his (and Jordan’s) general
transformation theory, Dirac showed that Heisenberg’s matrix mechanics and Schrödinger’s
wave mechanics were related by a (unitary) transformation, whereas for von Neumann
they were two different realizations of his abstract (separable) Hilbert space.
In particular, von Neumann’s approach a priori dispenses with a basis choice altogether;
this is precisely the difference between an operator and a matrix Schmidt alluded
to in the above quotation. Indeed, von Neumann’s abstract approach (which as
a co-founder of functional analysis he shared with Banach, but not with his mentor
Hilbert) was remarkable even in mathematics; in physics it must have been dazzling.

It is instructive to compare this situation with special relativity, where, so to 
speak, Dirac would write down the theory in terms of inertial frames of reference, 
so as to subsequently argue that due to Poincaré-invariance the physical content of
the theory does not depend on such a choice. Von Neumann, on the other hand (had 
he ever written a treatise on relativity), would immediately present Minkowski’s 
space-time picture of the theory and develop it in a coordinate-free fashion. 

However, this analogy is also misleading. In special relativity, all choices of inertial
frames are genuinely equivalent, but in quantum mechanics one often does have
preferred observables: as Bohr would argue from his Como Lecture in 1927 onwards 
(Bohr, 1928), these observables are singled out by the choice of some experimental 
context, and they are jointly measurable iff they commute (see also below). Though 
not necessarily developed with Bohr’s doctrine in mind, Dirac’s approach seems 
tailor-made for this situation, since his basis choice is equivalent to a choice of 
“preferred” physical observables, namely those that are diagonal in the given basis 
(for Heisenberg this was energy, while for Schrödinger it was position).

Von Neumann’s abstract approach can deal with preferred observables and
experimental contexts, too, though the formalism for doing so is more demanding.
Namely, for reasons ranging from quantum theory to ergodic theory via unitary
group representations on Hilbert space, from 1930 onwards von Neumann developed
his theory of “rings of operators” (nowadays called von Neumann algebras),
partly in collaboration with his assistant Murray (von Neumann, 1930, 1931, 1938,
1940, 1949; Murray & von Neumann, 1936, 1937, 1943). For us, at least at the
moment, the point is that Dirac’s diagonal observables are formalized by maximal
commutative von Neumann algebras A on some Hilbert space. These often come
naturally with some specific realization of a Hilbert space; for example, on
Heisenberg’s Hilbert space $\ell^2(\mathbb{N})$ one has $A_d = \ell^\infty(\mathbb{N})$, while
Schrödinger’s $L^2(\mathbb{R}^3)$ is host to $A_c = L^\infty(\mathbb{R}^3)$, both realized as
multiplication operators (cf. Proposition B.73).
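
To make the phrase "realized as multiplication operators" concrete, here is a minimal sketch. The notation $m_f$ is an assumption introduced here, not taken from the text; $f$ stands for a bounded sequence on $\mathbb{N}$ in the discrete case or an essentially bounded function in the continuous case:

```latex
% Discrete case: f is a bounded sequence, \psi \in \ell^2(\mathbb{N}); m_f is a diagonal matrix.
\[
  (m_f \psi)_n = f_n \psi_n .
\]
% Continuous case: f is essentially bounded, \psi is square-integrable.
\[
  (m_f \psi)(x) = f(x)\, \psi(x) .
\]
% In both cases the operators commute, and the algebraic operations mirror those on functions:
\[
  m_f m_g = m_{fg} = m_g m_f, \qquad m_f^{*} = m_{\overline{f}}, \qquad \| m_f \| = \| f \|_\infty .
\]
```

Commutativity is immediate from pointwise multiplication; maximality (no further bounded operator commutes with all $m_f$ without itself being of this form) is the nontrivial part and is not shown here.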
Although the second (1931) paper in the above list shows that von Neumann was
well aware of the importance of the commutative case of his theory of operator
algebras, he (perhaps deliberately) missed the link with Bohr’s ideas. As explained
in the remainder of this Introduction, providing this link is one of the main themes 
of this book, but we will do so using the more powerful formalism of C*-algebras. 
Introduced by Gelfand & Naimark (1943), these are abstractions and generalizations
of von Neumann algebras, so abstract indeed that Hilbert spaces are not even
mentioned in their definition. Nonetheless, C*-algebras remain very closely tied to 
Hilbert spaces through the GNS-construction originating with Gelfand & Naimark 
(1943) and Segal (1947b), which implies that any C*-algebra is isomorphic to a 
well-behaved algebra of bounded operators on some Hilbert space (see §C.12). 
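
For orientation, the standard definition and the content of the GNS-construction can be summarized as follows. This is a sketch in notation assumed here ($\omega$, $H_\omega$, $\pi_\omega$, $\Omega_\omega$); the book's own formulation is the one in the appendix cited above:

```latex
% A C*-algebra A is a complex Banach algebra with an involution a \mapsto a^* such that
\[
  \| a b \| \le \| a \| \, \| b \|, \qquad \| a^{*} a \| = \| a \|^{2} \qquad (a, b \in A).
\]
% No Hilbert space appears in these axioms. It re-enters through states:
% each state \omega on A yields a Hilbert space H_\omega, a representation \pi_\omega of A
% by bounded operators on H_\omega, and a unit vector \Omega_\omega \in H_\omega with
\[
  \omega(a) = \langle \Omega_\omega, \pi_\omega(a)\, \Omega_\omega \rangle \qquad (a \in A).
\]
```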

Starting with Segal (1947a), C*-algebras have become an important tool in mathematical
physics, where traditionally most applications have been to quantum systems
with infinitely many degrees of freedom, such as quantum statistical mechanics
in infinite volume (Ruelle, 1969; Israel, 1979; Bratteli & Robinson, 1981; Haag,
1992; Simon, 1993) and quantum field theory (Haag, 1992; Araki, 1999).

Although we draw on the first body of literature, and were at least influenced
by the second, the present book employs C*-algebras in a rather different fashion,
in that we exploit the unification they provide of the commutative and the noncommutative
“worlds” into a single mathematical framework (where one should note
that as far as physics is concerned, the commutative or classical case is not purely
C*-algebraic in character, because one also needs a Poisson structure, see Chapter
3). This unified language (supplemented by some category theory, group(oid) theory,
and differential geometry) gives a mathematical handle on Wittgenstein’s
Aspektwechsel between classical and quantum-mechanical modes of description (see
Preface), which in our view lies at the heart of the foundations of quantum physics.
This “change of perspective”, which roughly speaking amounts to switching (and
interpolating) between commutative and noncommutative C*-algebras, is added to
Dirac’s transformation theory (which comes down to switching between generalized
bases, or, equivalently, between maximal commutative von Neumann algebras).

The central conceptual importance of the Aspektwechsel for this book in turn 
derives from our adherence to Bohr’s doctrine of classical concepts, which forms 
part of the Copenhagen Interpretation of quantum mechanics (here defined strictly 
as a body of ideas shared by Bohr and Heisenberg). We let the originators speak: 

‘It is decisive to recognize that, however far the phenomena transcend the scope of classical 
physical explanation, the account of all evidence must be expressed in classical terms. The 
argument is simply that by the word experiment we refer to a situation where we can tell 
others what we have done and what we have learned and that, therefore, the account of 
the experimental arrangements and of the results of the observations must be expressed in 
unambiguous language with suitable application of the terminology of classical physics.’ 
(Bohr, 1949, p. 209) 

‘The Copenhagen interpretation of quantum theory starts from a paradox. Any experiment 
in physics, whether it refers to the phenomena of daily life or to atomic events, is to be 
described in the terms of classical physics. The concepts of classical physics form the language
by which we describe the arrangement of our experiments and state the results. We
cannot and should not replace these concepts by any others.’ (Heisenberg 1958, p. 44) 







The last quotation even opens Heisenberg’s only systematic presentation of the
Copenhagen Interpretation, which forms Chapter III of his Gifford Lectures from
1955; apparently this was the first occasion where the name “Copenhagen Interpretation”
was used (Howard, 2004). In our view, several other defining claims of the
Copenhagen Interpretation appear to be less well founded, if not unwarranted, although
they may have been understandable in the historical context where they were
first proposed (in which the new theory of quantum mechanics needed to get going
even in the face of the foundational problems that all of the originators, including
Bohr and Heisenberg, were keenly aware of). These spurious claims include:

• The emphatic rejection of the possibility of analyzing what is going on during
measurements, as expressed in typical Bohr parlance by claims like:

‘According to the quantum theory, just the impossibility of neglecting the interaction 
with the agency of measurement means that every observation introduces a new
uncontrollable element.’ (Bohr, 1928, p. 584),

or, with similar (but somehow less off-putting) dogmatism by Heisenberg:

‘So we cannot completely objectify the result of an observation’ (1958, p. 50). 

• The closely related interpretation of quantum-mechanical states (which Heisenberg
indeed referred to as “probability functions”) as mere catalogues of the
probabilities attached to possible outcomes of experiments, as in:

‘what one deduces from observation is a probability function, a mathematical expression 
that combines statements about possibilities or tendencies with statements about our 
knowledge of facts’ (Heisenberg 1958, p. 50).

In addition, there are two ingredients of the avowed Copenhagen Interpretation that Bohr
and Heisenberg actually seem to have disagreed about. These are:

• The collapse of the wave-function (i.e., upon completion of a measurement),
which was introduced by Heisenberg (1927) in his paper on the uncertainty relations.
As we shall see in Chapter 11, this idea was widely adopted by the pioneers
of quantum mechanics (and it still is), but apparently it was never endorsed by 
Bohr, who saw the wave-function as a “symbolic” expression (cf. Dieks, 2016a). 

• Bohr’s doctrine of Complementarity, which—though never precisely articulated— 
he considered to be a revolutionary philosophical insight of central importance to 
the interpretation of quantum mechanics (and even beyond). Heisenberg, on the 
other hand, regarded complementary descriptions (which Bohr saw as incompatible)
as mathematically equivalent and at best paid lip-service to the idea. The
reason for this discord probably lies in the fact that Heisenberg was typically
guided by (quantum) theory, whereas Bohr usually started from experiments;
Heisenberg once even referred to his mentor as a ‘philosopher of experiment’.
Therefore, Heisenberg was satisfied that for example position and momentum
were related by a unitary operator (i.e. the Fourier transform), whereas Bohr had
the incompatible experimental arrangements in mind that were required to measure
these quantities. Their difference, then, contrasted theory and experiment.
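
The unitary relation between position and momentum mentioned in the last item can be made explicit in one dimension. This is a sketch under a stated convention: the normalization of $\mathcal{F}$ and the symbols $Q$, $P$, $m_x$, $m_p$ are assumptions chosen here, and domain questions for the unbounded operators are ignored:

```latex
% Fourier transform on L^2(\mathbb{R}), unitary by Plancherel's theorem:
\[
  (\mathcal{F}\psi)(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} e^{-ipx/\hbar}\, \psi(x)\, dx .
\]
% Position Q = m_x acts by multiplication with x; momentum is P = -i\hbar\, d/dx.
% Integrating by parts (for suitably decaying \psi) gives
\[
  (\mathcal{F} P \psi)(p) = p\, (\mathcal{F}\psi)(p), \qquad \text{i.e.} \qquad
  \mathcal{F} P \mathcal{F}^{-1} = m_p .
\]
```

In this representation momentum becomes a multiplication operator, just as position is in the original one, which is the sense in which the two descriptions are mathematically equivalent.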




Let us now review the philosophical motivation Bohr and Heisenberg gave for their 
mutual doctrine of classical concepts. First, Bohr (in his typical convoluted prose): 

‘The elucidation of the paradoxes of atomic physics has disclosed the fact that the unavoidable
interaction between the objects and the measuring instruments sets an absolute limit
to the possibility of speaking of a behavior of atomic objects which is independent of the 
means of observation. We are here faced with an epistemological problem quite new in natural
philosophy, where all description of experience has so far been based on the assumption,
already inherent in ordinary conventions of language, that it is possible to distinguish
sharply between the behavior of objects and the means of observation. This assumption 
is not only fully justified by all everyday experience but even constitutes the whole basis 
of classical physics. (...) As soon as we are dealing, however, with phenomena like individual
atomic processes which, due to their very nature, are essentially determined by the
interaction between the objects in question and the measuring instruments necessary for 
the definition of the experimental arrangement, we are, therefore, forced to examine more 
closely the question of what kind of knowledge can be obtained concerning the objects. In 
this respect, we must, on the one hand, realize that the aim of every physical experiment— 
to gain knowledge under reproducible and communicable conditions—leaves us no choice
but to use everyday concepts, perhaps refined by the terminology of classical physics, not 
only in all accounts of the construction and manipulation of the measuring instruments but 
also in the description of the actual experimental results. On the other hand, it is equally 
important to understand that just this circumstance implies that no result of an experiment 
concerning a phenomenon which, in principle, lies outside the range of classical physics 
can be interpreted as giving information about independent properties of the objects.’ 

This text has been taken from Bohr (1958, p. 25), but very similar passages appear 
in many of Bohr’s writings from his famous Como Lecture (Bohr, 1928) onwards. 
In other words, the (supposedly) unavoidable interaction between the objects and 
the measuring instruments, which for Bohr represents the characteristic feature of 
quantum mechanics (and which we would now express in terms of entanglement, 
of which concept Bohr evidently had an intuitive grasp), threatens the objectivity 
of the description that is characteristic of (if not the defining property of) classical
physics. However, this threat can be countered by describing quantum mechanics
through classical physics, which (or so the argument goes) restores objectivity. Elsewhere,
we see Bohr also insisting on the need for classical concepts in defining any
meaningful theory whatsoever, as these are the only concepts we really understand 
(though, as he always insists, classical concepts are at the same time challenged by 
quantum theory, as a consequence of which their use is necessarily limited). 

Although Heisenberg’s arguments for the necessity of classical concepts start 
similarly, they eventually take a conspicuously different direction from Bohr’s: 

‘To what extent, then, have we finally come to an objective description of the world, especially
of the atomic world? In classical physics science started from the belief—or should
one say from the illusion?—that we could describe the world or at least parts of the world 
without any reference to ourselves. This is actually possible to a large extent. We know that 
the city of London exists whether we see it or not. It may be said that classical physics 
is just that idealization in which we can speak about parts of the world without any reference
to ourselves. Its success has led to the general ideal of an objective description of
the world. Objectivity has become the first criterion for the value of any scientific result. 
Does the Copenhagen interpretation of quantum theory still comply with this ideal? One 
may perhaps say that quantum theory corresponds to this ideal as far as possible. Certainly 
quantum theory does not contain genuine subjective features, it does not introduce the mind 
of the physicist as a part of the atomic event. But it starts from the division of the world
into the object and the rest of the world, and from the fact that at least for the rest of the
world we use the classical concepts in our description. This division is arbitrary and historically
a direct consequence of our scientific method; the use of the classical concepts is
finally a consequence of the general human way of thinking. But this is already a reference 
to ourselves and in so far our description is not completely objective. (...) 

The concepts of classical physics are just a refinement of the concepts of daily life and are 
an essential part of the language which forms the basis of all natural science. Our actual 
situation in science is such that we do use the classical concepts for the description of the 
experiments, and it was the problem of quantum theory to find theoretical interpretation of 
the experiments on this basis. There is no use in discussing what could be done if we were 
other beings than we are. (...) 

Natural science does not simply describe and explain nature; it is a part of the interplay 
between nature and ourselves; it describes nature as exposed to our method of questioning.’ 
(Heisenberg, 1958, pp. 55-56, 56, 81)

The well-known last part may indeed have been the source of the crucial ‘I’m the
one who knocks’ episode in the superb tv-series Breaking Bad (whose criminal main 
character operates under the cover name of “Heisenberg”). This is worth mentioning 
here, because Heisenberg (and to a lesser extent also Bohr) displays a puzzling 
mixture between the hubris of claiming that quantum mechanics has restored Man’s 
position at the center of the universe and the modesty of recognizing that nonetheless 
Man has to know his limitations (in necessarily relying on the classical concepts he 
happens to be familiar with at the current state of evolution and science). 

Our own reasons for favoring the doctrine of classical concepts are threefold. 
The first is closely related to Heisenberg’s and may be expressed even better by the 
following passage from a book by the renowned Dutch primatologist Frans de Waal:

‘Die Verwandlung [i.e., The Metamorphosis by Franz Kafka, in which Gregor Samsa famously
wakes up to find himself transformed into an insect], published in 1915, was an
unusual take-off for a century in which anthropocentrism declined. For metaphorical reasons,
the author had picked a repulsive creature, forcing us from the first page onwards to
feel what it would be like to be an insect. Around the same time, the German biologist
Jakob von Uexküll drew attention to the fact that each particular species has its own perspective,
which he called its Umwelt. To illustrate this new idea, Uexküll took his readers
on a tour through the worlds of various creatures. Each organism observes its environment 
in its own peculiar way, he argued. A tick, which has no eyes, climbs onto a grass blade, 
where it awaits the scent of butyric acid off the skin of mammals that pass by. Experiments 
have demonstrated that ticks may survive without food for as long as 18 years, so that a tick 
has ample time to wait for her prey, jump on it, and suck its warm blood, after which she 
is ready to lay her eggs and die. Are we in a position to understand the Umwelt of a tick? 

It seems unbelievably poor compared to ours, but Uexküll regarded its simplicity rather as
a strength: ticks have set themselves a narrow goal and hence cannot easily be distracted.
Uexküll analysed many other examples, and showed how a single environment offers hundreds
of different realities, each of which is unique for some given species. (...) Some
animals merely register ultraviolet light, others live in a world of odors, or of touch, like a 
star nose mole. Some animals sit on a branch of an oak, others live underneath the bark of 
the same oak, whilst a fox family digs a hole underneath its roots. Each animal observes the 
tree differently.’ (De Waal, 2016, pp. 15-16. Translation by the author). 

Indeed, it is hardly an accident that De Waal preceded this passage by a quotation 
from Heisenberg almost identical to the last one above. 
A second argument in favour of the doctrine lies in the possibility of a peaceful
outcome of the Bohr-Einstein debate, or at least of an important part of it; cf. Landsman
(2006a), which was inspired by earlier work of Raggio (1981, 1988) and Bacciagaluppi
(1993). This debate initially centered on Einstein’s attempts to debunk
the Heisenberg uncertainty relations, and subsequently, following Einstein’s grudging
acceptance of their validity, entered its most famous and influential phase, in
which Einstein tried to prove that quantum mechanics, although admittedly correct,
was incomplete. One could argue that both antagonists eventually lost this part of
the debate, since Einstein’s goal of a local realistic (quantum) physics was quashed
by the famous work of Bell (1964), whereas against Bohr’s views, deterministic versions
of quantum mechanics such as Bohmian mechanics and the Everett (i.e. Many
Worlds) Interpretation turned out to be at least logical possibilities.

However incompatible the views of Einstein and Bohr on physics and its goals 
may have been, unknown to them a common battleground did in fact exist and could 
even have led to a reconciliation of at least the epistemological views of the great adversaries.
The common ground referred to concerns the problem of objectification,
which at first sight Bohr and Einstein approached in completely different ways: 

• Bohr objectified a quantum system through the specification of a classical experimental
context, i.e. by looking at it through appropriate classical glasses.

• Einstein objectified any physical system by claiming its independent existence: 

‘The belief in an external world independent of the perceiving subject is the basis of all 
natural science.’ (Einstein, 1954, p. 266). 

On a suitable mathematical interpretation, these conditions for the objectification
of the system turn out to be equivalent! Namely, identifying Bohr’s apparatus with
Einstein’s perceiving subject, calling its algebra of observables A, and denoting the
algebra of observables of the quantum system to be objectified by B, our reading of
the doctrine of classical concepts (to be explained in more detail below) is simply
that A be commutative. Einstein, on the other hand, insists that the system under
observation has its own state, so that there must be no entangled states on the tensor
product $A \otimes B$ that describes the composite system. Equivalently, every pure state on
$A \otimes B$ must be a product state, so that both A and B have states that together determine
the joint state of $A \otimes B$. This is the case if and only if A or B is commutative,
and since B is taken to be a quantum system, it must be A (see the notes to §6.5 for
details). Thus Bohr’s objectification criterion turns out to coincide with Einstein’s!
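
One direction of this equivalence can be checked numerically in the smallest case. The sketch below is an illustration only, not the argument referred to in the notes: it assumes that both algebras are the (noncommutative) 2×2 matrices, so that pure states are unit vectors in C² ⊗ C², and it uses the Schmidt rank (the number of nonzero singular values of the coefficient matrix) to tell product states from entangled ones. The function name `schmidt_rank` and the variable names are hypothetical choices made here.

```python
import numpy as np


def schmidt_rank(psi, dim_a, dim_b, tol=1e-12):
    """Count the nonzero Schmidt coefficients of a vector in C^dim_a (x) C^dim_b.

    A pure vector state is a product state exactly when this rank is 1.
    """
    # Reshape the coefficient vector into a dim_a x dim_b matrix; its
    # singular values are the Schmidt coefficients.
    coefficients = np.linalg.svd(psi.reshape(dim_a, dim_b), compute_uv=False)
    return int(np.sum(coefficients > tol))


e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

# A product vector: e0 (x) (e0 + e1)/sqrt(2).
product_state = np.kron(e0, (e0 + e1) / np.sqrt(2))

# The Bell vector (e0 (x) e0 + e1 (x) e1)/sqrt(2), which is entangled.
bell_state = (np.kron(e0, e0) + np.kron(e1, e1)) / np.sqrt(2)

print(schmidt_rank(product_state, 2, 2))  # 1: each factor has its own state
print(schmidt_rank(bell_state, 2, 2))     # 2: no such factorization exists
```

When both algebras are noncommutative, entangled pure states such as the Bell vector exist, which is exactly what the commutativity condition on A or B rules out; the converse direction is not something a finite numerical check of this kind can establish.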

Thirdly, the doctrine of classical concepts describes all known applications to 
date of quantum theory to experimental physics; and therefore we simply have to 
use it if we are interested in understanding these applications. This is true for the 
entire range of empirically accessible energy and length scales, from molecular and 
condensed matter physics (including quantum computation) to high-energy physics 
(in colliders as well as in the context of astro-particle physics). So if people working 
in a field like quantum cosmology complain about the Copenhagen Interpretation 
then perhaps they should ask themselves if their field is more than a chimera. 

Given its clear empirical relevance, it is a moot point whether the doctrine of 
classical concepts is as necessary as Bohr and Heisenberg claimed it was:

‘In their attempts to formulate the general content of quantum mechanics, the
representatives of the Copenhagen School often used formulations with which they do not merely
say how things are in their opinion, but beyond that, they say that things must be thus and 
so (...) They chose formulations for the mere communication of an item in which at the 
same time the inevitability of what is communicated is asserted. (...) The assertion of the 
necessity of a proposition adds nothing to its content.’ (Scheibe, 2001, pp. 402-403) 

The doctrine of classical concepts implies in particular that the measuring apparatus
is to be described classically; indeed, along with its coupling to the system
undergoing measurement, it is its classical description which turns some device— 
which a priori is a quantum system like anything else—into a measuring apparatus. 
This point was repeated over and over by Bohr and Heisenberg, but in our view the 
clearest explanation of this crucial point has been given by Scheibe: 

‘It is necessary to avoid any misunderstanding of the buffer postulate [i.e., the doctrine 
of classical concepts], and in particular to emphasize that the requirement of a classical 
description of the apparatus is not designed to set up a special class of objects differing 
fundamentally from those which occur in a quantum phenomenon as the things examined 
rather than measuring apparatus. This requirement is essentially epistemological, and
affects this object only in its role as apparatus. A physical object which may act as apparatus
may in principle also be the thing examined. (...) The apparatus is governed by classical
physics, the object by the quantum-mechanical formalism.’ (Scheibe, 1973, pp. 24-25)

Thus it is essential to the Copenhagen Interpretation that one can describe at least
some quantum-mechanical devices classically: those for which this is possible
include the candidate-apparatuses (i.e. measuring devices). In view of its importance
for their interpretation of quantum mechanics, it is remarkable how little Bohr,
Heisenberg, and their followers did to seriously address this problem of a dual
description of at least part of the world, although they were clearly aware of this need:

‘In the system to which the quantum mechanical formalism is to be applied, it is of course 
possible to include any intermediate auxiliary agency employed in the measuring process. 
Since, however, all those properties of such agencies which, according to the aim of
measurements have to be compared with the corresponding properties of the object, must be
described on classical lines, their quantum mechanical treatment will for this purpose be 
essentially equivalent with a classical description.’ (Bohr, 1939, pp. 23-24; quotation taken 
from Camilleri & Schlosshauer, 2015, p. 79) 

In defense of this alleged equivalence, we read almost circular explanations like: 

‘the necessity of basing the description of the properties and manipulation of the
measuring instruments on purely classical ideas implies the neglect of all quantum effects in that
description.’ (Bohr, 1939, p. 19) 

Since it delineates an appropriate regime, the following is slightly more informative: 

‘Incidentally, it may be remarked that the construction and the functioning of all apparatus 
like diaphragms and shutters, serving to define geometry and timing of the experimental 
arrangements, or photographic plates used for recording the localization of atomic objects, 
will depend on properties of materials which are themselves essentially determined by the 
quantum of action. Still, this circumstance is irrelevant for the study of simple atomic
phenomena where, in the specification of the experimental conditions, we may to a very high
degree of approximation disregard the molecular constitution of the measuring instruments.
If only the instruments are sufficiently heavy compared with the atomic objects under
investigation, we can in particular neglect the requirement of the [uncertainty] relation as regards
the control of the localization in space and time of the single pieces of the apparatus relative
to each other.’ (Bohr, 1948, pp. 315-316).

Even Heisenberg restricted himself to very general comments like: 

‘This follows mathematically from the fact that the laws of quantum theory are for the
phenomena in which Planck’s constant can be considered as a very small quantity,
approximately identical with the classical laws.’ (Heisenberg, 1958, p. 57).

Notwithstanding these vague or even circular explanations, the connection between 
classical and quantum mechanics was at the forefront of research in the early days 
of quantum theory, and even predated quantum mechanics. For example, Jammer
(1966, p. 109) notes that already in 1906 Planck suggested that

‘the classical theory can simply be characterized by the fact that the quantum of action 
becomes infinitesimally small.’ 

In fact, in the same context as Planck, namely his radiation formula, Einstein made
a similar point already in 1905. Subsequently, Bohr’s Correspondence Principle,
which originated in the context of atomic radiation, suggested an asymptotic
relationship between quantum mechanics and classical electrodynamics. As such, it
played a major role in the creation of quantum mechanics (Bohr, 1976; Jammer,
1966; Mehra & Rechenberg, 1982; Hendry, 1984; Darrigol, 1992), but the
contemporary (and historically inaccurate) interpretation of the Correspondence Principle
as the idea that all of classical physics should be a certain limiting case of quantum
physics seems of much later date (cf. Landsman, 2007a; Bokulich, 2008).

Ironically, the possibility of giving a dual classical-quantum description of
measurement apparatuses, though obviously crucial for the consistency of the
Copenhagen Interpretation, simply seems to have been taken for granted. Likewise, the
more ambitious problem of explaining at least the appearance of the classical world
(i.e. beyond measurement devices) from quantum theory, which is central to current
research in the foundations of quantum mechanics, is not to be found in the
writings of Bohr (who, after all, saw the explanation of experiments as his job).

Perhaps Heisenberg could have used the excuse that he regarded the problem as
solved by his 1927 paper on the uncertainty relations; but on both technical and
conceptual grounds it would have been a feeble excuse. One of the few expressions of at
least some dissatisfaction with the situation from within the Copenhagen school,
however mildly phrased, came from Bohr’s former research associate Landau:

‘Thus quantum mechanics occupies a very unusual place among physical theories: it
contains classical mechanics as a limiting case, yet at the same time it requires this limiting
case for its own formulation.’ (Landau & Lifshitz, 1977, p. 3) 

In other words, the relationship between the (generalized) Correspondence Principle 
and the doctrine of classical concepts needs to be clarified, and such a clarification 
should hopefully also provide the key for the solution of the grander problem of 
deriving the classical world from quantum theory under appropriate conditions.

As a first step to this end, Bohr’s conceptual ideas should be interpreted within
the formalism of quantum mechanics before they can be applied to the physical 
world, an intermediate step Bohr himself seems to have considered superfluous: 

‘I noticed that mathematical clarity had in itself no virtue for Bohr. He feared that the 
formal mathematical structure would obscure the physical core of the problem, and in any 
case, he was convinced that a complete physical explanation should absolutely precede the 
mathematical formulation.’ (Heisenberg, 1967, p. 98) 

Fortunately, von Neumann did not return the compliment, since beyond its brilliant 
mathematical content, his Mathematische Grundlagen der Quantenmechanik from 
1932 devoted considerable attention to conceptual issues. For example, he gave the
most general form of the Born rule (which is the central link between experimental
physics and the Hilbert space formalism); he introduced density operators for
quantum statistical mechanics (which are still in use); he conceptualized projection
operators as yes-no questions (paving the way for his later development of quantum
logic with Birkhoff, as well as for Gleason’s Theorem and the like); in his analysis
of hidden variables he introduced the mathematical concept of a state that became
pivotal in operator algebras (including the algebraic approach to quantum mechanics),
en passant also preparing the ground for the theorems of Bell and Kochen &
Specker (which exclude hidden variables under physically more relevant assumptions
than von Neumann’s); and, last but not least, his final chapter on the measurement
problem formed the basis for all serious subsequent literature on this topic.

Nonetheless, much as Bohr’s philosophy of quantum mechanics would benefit 
from a precise mathematical interpretation, von Neumann’s mathematics would be 
more effective in physics if it were supplemented by sound conceptual moves (beyond
the ones he provided himself). Killing two birds with one stone, we implement
the doctrine of classical concepts in the language of operator algebras, as follows: 

The physically relevant aspects of the noncommutative operator algebras of
quantum-mechanical observables are only accessible through commutative algebras.

Our Bohrification program, then, splits into two parts, which are distinguished by
the precise relationship between a given noncommutative operator algebra A
(representing the observables of some quantum system, as detailed below) and the
commutative operator algebras (i.e. classical contexts) that give physical access to A.

While delineated mathematically, these two branches also reflect an unresolved
conceptual disagreement between Bohr and Heisenberg about the status of classical
concepts (Camilleri, 2009b). According to Bohr (haunted by his idea of
Complementarity), only one classical concept (or one coherent family of classical
concepts) applies to the experimental study of some quantum object at a time.
If it applies, it does so exactly, and has the same meaning as in classical physics;
in Bohr’s view, any other meaning would be undefined. In a different experimental
setup, some other classical concept may apply. Examples of such “complementary”
pairs are particle versus wave (an example Bohr stopped using after a while),
space-time description versus “causal description” (by which Bohr means conservation
laws), and, in his later years, one “phenomenon” (i.e., an indivisible unit of a quantum
object plus an experimental arrangement) against another. For example:

‘My main purpose (...) is to emphasize that in the phenomena concerned we are (...)
dealing with a rational discrimination between essentially different experimental arrangements
and procedures which are suited either for an unambiguous use of the idea of space
location, or for a legitimate application of the conservation theorem of momentum (...) which
therefore in this sense may be considered as complementary to each other (...) Indeed we
have in each experimental arrangement suited for the study of proper quantum phenomena
not merely to do with an ignorance of the value of certain physical quantities, but with the
impossibility of defining these quantities in an unambiguous way.’ (Bohr, 1935, p. 699).

Heisenberg, on the other hand, seems to have held a more relaxed attitude towards
classical concepts, perhaps inspired by his famous 1925 paper on the
quantum-mechanical reinterpretation (Umdeutung) of mechanical and kinematical relations,
followed by his equally great paper from 1927 already mentioned. In the former,
he introduced what we now call quantization, in putting the observables of classical
physics (i.e. functions on phase space) on a new mathematical footing by turning
them into what we now call operators (initially in the form of infinite matrices),
where they also have new properties. In the latter, Heisenberg tried to find some
operational meaning of these operators through measurement procedures. Since
quantization applies to all classical observables at once, all classical concepts apply
simultaneously, but approximately (ironically, like most research on quantum theory
at the time, the 1925 paper was inspired by Bohr’s Correspondence Principle).

To some extent, then, Bohr’s view on classical concepts comes back mathematically
in exact Bohrification, which studies (unital) commutative C*-subalgebras C
of a given (unital) noncommutative C*-algebra A, whereas Heisenberg’s interpretation
of the doctrine resurfaces in asymptotic Bohrification, which involves asymptotic
inclusions (more specifically, deformations) of commutative C*-algebras into
noncommutative ones. So the latter might have been called Heisenbergification
instead, but in view of both the ugliness of this word and the historical role played by
Bohr’s Correspondence Principle just alluded to, the given name has stuck.

The precise relationship between Bohr’s and Heisenberg’s views, and hence also 
between exact and asymptotic Bohrification, remains to be clarified; their joint
existence is unproblematic, however, since the two programs complement each other.

• Exact Bohrification turns out to be an appropriate framework for: 

- The Born rule (for single case probabilities). 

- Gleason’s Theorem (which justifies von Neumann’s notion of a state as a positive
linear expectation value, assuming the operator part of quantum theory).

- The Kochen-Specker Theorem (excluding non-contextual hidden variables). 

- The Kadison-Singer Conjecture (concerning uniqueness of extensions of pure
states from maximal commutative C*-subalgebras of the algebra $B(H)$ of all
bounded operators on a separable Hilbert space $H$ to $B(H)$).

- Wigner’s Theorem (on unitary implementation of symmetries of pure states 
with transition probabilities, and its analogues for other quantum structures). 

- Quantum logic (which, if one adheres to the doctrine of classical concepts, 
turns out to be intuitionistic and hence distributive, rather than orthomodular). 

- The topos-theoretic approach to quantum mechanics (which from our point 
of view encompasses quantum logic and implies the preceding claim).

• Asymptotic Bohrification, on the other hand, provides a mathematical setting for:

- The classical limit of quantum mechanics. 

- The Born rule (for probabilities measured as long-run frequencies). 

- The infinite-volume limit of quantum statistical mechanics. 

- Spontaneous symmetry breaking (SSB). 

- The Measurement Problem (highlighted by Schrödinger’s Cat).

On the philosophical side, the limiting procedures inherent in asymptotic
Bohrification may be seen in the light of the (alleged) phenomenon of emergence. From
the philosophical literature, we have distilled two guiding thoughts which, in our
opinion, should control the use of limits, idealizations, and emergence in physics
and hence play a paramount role in this book. The first is Earman’s Principle:

‘While idealizations are useful and, perhaps, even essential to progress in physics, a sound
principle of interpretation would seem to be that no effect can be counted as a genuine
physical effect if it disappears when the idealizations are removed.’ (Earman, 2004, p. 191)

The second is Butterfield’s Principle, which in a sense is a corollary to Earman’s
Principle, and should be read in the light of Butterfield’s own definition of
emergence as ‘behaviour that is novel and robust relative to some comparison class’,
which among other virtues removes the reduction-emergence opposition:

“there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to
the limit, i.e. for finite N. And it is this weaker behaviour which is physically real.”
(Butterfield, 2011, p. 1065)

Indeed, the link between theory and reality stands or falls with an adherence to these
principles, for real materials (like a ferromagnet or a cat) are described by the
quantum theory of finite systems (i.e., $\hbar > 0$ or $N < \infty$, as opposed to their idealized
limiting cases $\hbar = 0$ or $N = \infty$), and yet they do display the remarkable phenomena
that strictly speaking are only possible in the corresponding limit theories, like
symmetry breaking, or the fact that cats are either dead or alive, as a metaphor for
the fact that measurements have outcomes. This simple observation shows that any
physically relevant conclusion drawn from some idealization must be foreshadowed
in the underlying theory already for positive values of $\hbar$ or finite values of $N$.

Despite their obvious validity, it is remarkable how often idealizations violate
these principles. For example, all rigorous theories of spontaneous symmetry breaking
in quantum statistical mechanics (Bratteli & Robinson, 1981) and in quantum
field theory (Haag, 1992) strictly apply to infinite systems only, since ground states
of finite quantum systems are typically unique (and hence symmetric), whilst thermal
equilibrium states of such systems are always unique (see also Chapter
10). As explained in Chapter 11, the “Swiss” approach to the measurement problem
based on superselection rules faces a similar problem, and must be discarded for that
reason. Bohr’s doctrine of classical concepts is particularly vulnerable to Earman’s
Principle, since classical physics (in whose language we are supposed to express the
account of all evidence) is not realized in nature but only in the human mind, so to
speak. This necessitates great care in implementing this doctrine.

Interestingly, in his famous lecture “Über das Unendliche”, in which he
expounded his finitary program intended to save mathematics against the devilish
intuitionist challenge of L.E.J. Brouwer, Hilbert (1925) expressed similar principles
controlling the use of infinite idealizations in mathematics:

“Und so wie bei den Grenzprozessen der Infinitesimalrechnung das Unendliche im Sinne
des Unendlichkleinen und des Unendlichgroßen sich als eine bloße Redensart erweisen ließ,
so müssen wir auch das Unendliche im Sinne der Unendlichen Gesamtheit, wo wir es jetzt
noch in den Schlußweisen vorfinden, als etwas bloß scheinbaren erkennen. Und so wie das
Operieren mit dem Unendlichkleinen durch Prozesse im Endlichen ersetzt wurde, welche
ganz dasselbe leisten und zu ganz denselben eleganten formalen Beziehungen führen, so
müssen überhaupt die Schlußweisen mit dem Unendlichen durch endliche Prozesse ersetzt
werden, die gerade dasselbe leisten, d.h. dieselben Beweisgänge und dieselben Methoden
der Gewinnung von Formeln und Sätzen ermöglichen.” (Hilbert, 1925, p. 162).®

In addition, asymptotic Bohrification has three rather more technical roots: 

1. A new approach to quantization theory developed in the 1970s under the name
of deformation quantization (Berezin, 1975; Bayen et al., 1978), where the
noncommutative algebras characteristic of quantum mechanics arise as deformations
of Poisson algebras. In Rieffel’s (1989, 1994) approach to deformation
quantization, further developed in Landsman (1998a), the deformed algebras are
C*-algebras, and hence the apparatus of operator algebras and noncommutative
geometry (Connes, 1994) becomes available. Deformation quantization gives a
mathematically precise and physically relevant meaning to the limit $\hbar \to 0$, and
shows that quantization and the classical limit are two sides of the same coin.

2. The mathematical analysis of the BCS-model of superconductivity initiated by
Bogoliubov (1958) and Haag (1962), which, in the more general setting of
mean-field models of solid state physics, culminated in the work of Bona (1988, 2000),
Raggio & Werner (1989), and Duffield & Werner (1992). These authors showed
that in the macroscopic limit $N \to \infty$, non-commutative algebras of
quantum-mechanical observables (which are typically tensor powers of matrix algebras
$M_n(\mathbb{C})$) converge to some commutative algebra (typically consisting of all
continuous functions on the state space of $M_n(\mathbb{C})$), at least for macroscopic averages.

3. The role of low-lying states and the ensuing instability of ground states under tiny
perturbations in the two limits at hand, discovered by Jona-Lasinio, Martinelli, &
Scoppola (1981) for the classical limit $\hbar \to 0$, and by Koma & Tasaki (1994) for
the macroscopic limit $N \to \infty$. In combination with the previous items, this led to
a new approach to the measurement problem (Landsman & Reuvers, 2013) and
to spontaneous symmetry breaking and emergence (Landsman, 2013), which in
particular addresses these issues in the framework of asymptotic Bohrification.
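
The convergence described in the second item can be made tangible in the simplest case of $2 \times 2$ matrices. The sketch below is our own illustration in Python with NumPy (the helper names `site_operator`, `average`, and `commutator_norm` are hypothetical); it builds the macroscopic averages of the Pauli matrices $\sigma_x$ and $\sigma_y$ over $N$ spins and computes the operator norm of their commutator. Since the commutator of the two averages equals $2i/N$ times the average of $\sigma_z$, this norm is $2/N$: the averages fail to commute at any finite $N$, but only by an amount that vanishes as $N \to \infty$.

```python
from functools import reduce

import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)

def site_operator(op, site, n):
    """op acting on one site of an n-fold tensor product, identity elsewhere."""
    return reduce(np.kron, [op if k == site else IDENTITY for k in range(n)])

def average(op, n):
    """Macroscopic average (1/n) * sum over sites of op."""
    return sum(site_operator(op, k, n) for k in range(n)) / n

def commutator_norm(n):
    """Operator norm of the commutator of the averaged sigma_x and sigma_y."""
    ax, ay = average(SIGMA_X, n), average(SIGMA_Y, n)
    return np.linalg.norm(ax @ ay - ay @ ax, ord=2)

norms = {n: commutator_norm(n) for n in range(1, 7)}
for n, value in norms.items():
    print(n, value, 2.0 / n)
```

The matrices grow as $2^N \times 2^N$, so this brute-force check is only feasible for a handful of spins; it illustrates the trend and is no substitute for the cited results.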

® ‘Just as in the limit processes of the infinitesimal calculus, the infinite in the sense of the infinitely 
large and the infinitely small proved to be merely a figure of speech, so too we must realize that 
the infinite in the sense of an infinite totality, where we still find it in deductive methods, is an 
illusion. Just as operations with the infinitely small were replaced by operations with the finite 
which yielded exactly the same results and led to exactly the same elegant formal relationships, 
so in general must deductive methods based on the infinite be replaced by finite procedures which 
yield exactly the same results, i.e., which make possible the same chains of proofs and the same 
methods of getting formulas and theorems.’ (Benacerraf & Putnam, 1983, p. 184).

This book is organized into two parts. Rather than following the partition of
our approach into exact and asymptotic Bohrification, these parts reflect the
(mathematical) sophistication of the material, starting with finite sets, and ending with
a combination of C*-algebras and topos theory. Part I, called $C_0(X)$ and $B(H)$,
gives a mathematical introduction to both classical and quantum mechanics from
an operator-algebraic point of view, in which these theories are kept separate, whilst
mathematical analogies are stressed whenever possible. This part emphasizes the
notion of symmetry, and includes some of the main abstract mathematical results
about quantum mechanics (i.e., those not involving the study of Schrödinger
operators and concrete models), such as the Born rule, the theorems of Gleason and
Kochen & Specker already mentioned, the one of Wigner (on symmetries) and its
numerous derivatives, including a new one on unitary implementability of symmetries
of the poset $\mathcal{C}(B(H))$ of unital commutative C*-subalgebras of $B(H)$, and
Stone’s Theorem on unitary implementability of time evolution in quantum
mechanics. This part may also serve as a reference for such fundamental theorems
about quantum mechanics. An unusual ingredient of this part is our discussion of
the Kadison-Singer Conjecture, included because of its fit into (exact) Bohrification.
Also elsewhere, results are (re)phrased in a language appropriate to this ideology.

Experts in the C*-algebraic approach to quantum mechanics will be able to read 
the second part independently of the first (which they might therefore skip if they 
find it to be too elementary), but the spirit of Bohrification will only be instilled in 
the reader if (s)he reads the entire book; indeed, it is this very spirit that keeps the 
two parts together and turns the book into a whole. Part II, entitled Between $C_0(X)$
and $B(H)$, starts with a survey of some known results on the grey area between
classical and quantum, such as Bell’s Theorem(s) and the so-called Free Will Theorem.
It then embarks on the asymptotic Bohrification program, including (deformation)
quantization and the classical limit (including a small excursion into
indistinguishable particles), large systems and their (thermodynamic) limit, and the Born rule
(revisited). This part centers on a somewhat idiosyncratic treatment of spontaneous
symmetry breaking (SSB) and the closely related measurement problem of quantum
mechanics, which is given an unusual but technically precise formulation in the
spirit of the Copenhagen Interpretation, and hence is meant to be relevant to actual 
experimental physics (which is what the Copenhagen Interpretation covers). 

Our treatment of both quantization and SSB relies mathematically on continuous
bundles of C*-algebras, while the principles of Earman and Butterfield provide
philosophical guidance. This is also true for our approach to the measurement
problem, which combines elements of quantization and SSB. Although experiments and
detailed theoretical models are lacking so far, this powerful combination of
mathematical and philosophical tools leads to a compelling scenario for solving the
measurement problem, harboring the hope of finally laying this problem to rest. Like
dynamical collapse models that require modifications of quantum mechanics, our 
scenario looks at the wave-function realistically, and hence describes measurement 
as a physical process, including the collapse that settles the outcome (as opposed to 
reinterpretations of the uncollapsed state, as in modal or Everettian interpretations). 
However, in our approach collapse takes place within unitary quantum theory.

Insolubility theorems for the measurement problem are circumvented, because
these rely on the counterfactual that if $\psi_n$ were the initial state, then for each $n$ it
would evolve (linearly) according to the Schrödinger equation with given
Hamiltonian $h$, whereas if the initial state were $\sum_n c_n \psi_n$, also then it would evolve
according to the same Hamiltonian $h$. However, Butterfield’s Principle implies that this
counterfactual is inapplicable precisely in the measurement situations it is meant
for, because the dual description of the apparatus as both classical and
quantum-mechanical causes extreme sensitivity of the wave-function to even the tiniest
perturbations of the Hamiltonian. Indeed, such perturbations dynamically enforce some
particular outcome of the measurement. Our scenario also rejects the typical way of
looking at measurement as a two-step process (going back to von Neumann himself
and widely adopted in the literature ever since), i.e., of firstly a transition of a pure
state to a mixed one (this is his ill-fated “process 1”), followed by the registration of
a single outcome. In real measurements (like elsewhere), pure states remain pure! If
our scenario is correct, the mistaken impression that quantum theory implies
the irreducible randomness of nature then arises because measurement outcomes
are merely unpredictable “for all practical purposes”; indeed, they are unpredictable
in a way that dwarfs even the apparent randomness of classical chaotic systems.
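
The kind of sensitivity invoked here can be caricatured in a two-level toy model, which is our own illustration and not the model analysed later in the book. Assume two localized states ("left" and "right") coupled by a tunnelling amplitude `gap`, which stands in for an exponentially small splitting between low-lying states, and a perturbation `delta` that lowers the energy of the right state; both names, and the $2 \times 2$ Hamiltonian itself, are assumptions made for this sketch (Python with NumPy). For fixed `delta`, however tiny, the ground state passes from an almost even superposition to an almost completely localized state as `gap` shrinks.

```python
import numpy as np

def ground_state_weights(gap, delta):
    """Weights (left, right) of the ground state of [[delta, -gap], [-gap, -delta]]."""
    h = np.array([[delta, -gap], [-gap, -delta]])
    _, vectors = np.linalg.eigh(h)  # eigenvalues are returned in ascending order
    return vectors[:, 0] ** 2       # first column = ground state

delta = 1e-6  # tiny, fixed perturbation favouring the right state
table = {gap: ground_state_weights(gap, delta)
         for gap in (1e-2, 1e-4, 1e-6, 1e-8, 1e-10)}

for gap, (left, right) in table.items():
    print(f"gap = {gap:.0e}: left = {left:.6f}, right = {right:.6f}")
```

In this caricature the right-hand weight equals $\frac{1}{2}(1 + \delta/\sqrt{\delta^2 + \mathrm{gap}^2})$, so what matters is the ratio of perturbation to splitting rather than the absolute size of either. Note that this is a statement about ground states only; it does not model the dynamical process described above.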

The final chapter on topos theory and quantum logic elaborates on ideas originating
with Isham and Butterfield. It centers on the poset $\mathcal{C}(A)$ of all unital commutative
C*-subalgebras of a unital C*-algebra $A$, ordered by inclusion; with some goodwill,
one might call $\mathcal{C}(A)$ the mathematical home of Complementarity (although the
construction applies even when $A$ itself is commutative). The power of this poset is
already clear in Part I, where the special case $A = B(H)$ leads to a new version of
Wigner’s Theorem on unitary implementability of symmetries. Hamhalter’s Theorem,
which is a far-reaching generalization of this version, then shows that $\mathcal{C}(A)$ carries
at least as much information about $A$ as the pure state space. Furthermore, $\mathcal{C}(A)$
enforces a (new) notion of quantum logic that turns out to be intuitionistic in being
distributive but denying the law of the excluded middle (on which both classical
logic and the non-distributive quantum logic of Birkhoff-von Neumann are based).
Finally, $\mathcal{C}(A)$ gives rise to a quantum phase space (which is lacking in the usual
formalism), on which observables are functions and states are probability measures,
just like in classical physics (but now “internal” to a particular topos, i.e., a
mathematical universe alternative to set theory, in which logic is typically intuitionistic).

About a third of the book is devoted to mathematical appendices. Those on
functional analysis and operator algebras give thorough introductions to these subjects,
sparing the reader the effort to study books like Bratteli & Robinson (1981),
Conway (2007), Dudley (1989), Kadison & Ringrose (1983, 1986), Lance (1995),
Pedersen (1989), Reed & Simon (1972), Schmüdgen (2012), and Takesaki (2002, 2003).
The appendices on logic, category theory, and topos theory, on the other hand, are
far from exhaustive (though self-contained): they provide a shortcut to the necessary
parts of e.g. Johnstone (1987), Mac Lane (1998), and Mac Lane & Moerdijk
(1992), or, alternatively, of Bell & Machover (1977) and Bell (1988). Though
primarily meant to support the main body of the book, these appendices may also be of
some interest by themselves, especially to philosophers, but even to mathematicians.

As a “Quick Start Guide” for readers in a hurry, we now summarize the main
definitions in the theory of operator algebras. A C*-algebra is an associative algebra
(over $\mathbb{C}$) equipped with an involution (i.e., a real-linear map $a \mapsto a^*$ such that

$$a^{**} = a, \quad (ab)^* = b^* a^*, \quad (\lambda a)^* = \bar{\lambda} a^*,$$

for all $a, b \in A$ and $\lambda \in \mathbb{C}$), as well as a norm in which $A$ is complete (i.e., a Banach
space), such that algebra, involution, and norm are related by the axioms

$$\|ab\| \leq \|a\| \, \|b\|, \qquad \|a^* a\| = \|a\|^2.$$
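
As a quick sanity check on the two norm axioms, namely $\|ab\| \leq \|a\| \, \|b\|$ and $\|a^* a\| = \|a\|^2$, the following Python sketch (ours, using NumPy; the helper `op_norm` is a hypothetical name) verifies both numerically for complex $3 \times 3$ matrices with the operator norm. It also indicates why the second axiom is restrictive: the Frobenius norm, for instance, already violates it on the $2 \times 2$ identity matrix.

```python
import numpy as np

def op_norm(a):
    """Operator norm on B(C^n): the largest singular value."""
    return np.linalg.norm(a, ord=2)

rng = np.random.default_rng(1)
a = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
a_star = a.conj().T  # the involution is the adjoint

# First axiom (up to rounding) and defect in the second axiom.
submultiplicative = op_norm(a @ b) <= op_norm(a) * op_norm(b) + 1e-12
c_star_gap = abs(op_norm(a_star @ a) - op_norm(a) ** 2)

# The Frobenius norm is submultiplicative but fails the second axiom.
one = np.eye(2)
frobenius_gap = abs(np.linalg.norm(one.T @ one, "fro")
                    - np.linalg.norm(one, "fro") ** 2)

print(submultiplicative, c_star_gap, frobenius_gap)
```

A numerical check on random matrices is of course no proof; it merely shows the axioms at work in the finite-dimensional case.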

The two main classes of C*-algebras are: 

• The space $C_0(X)$ of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity (i.e.,
for any $\varepsilon > 0$ the set $\{x \in X \mid |f(x)| \geq \varepsilon\}$ is compact), where $X$ is some locally
compact Hausdorff space, with pointwise addition and multiplication, involution

$$f^*(x) = \overline{f(x)},$$

and a norm

$$\|f\|_\infty = \sup_{x \in X} \{|f(x)|\}.$$

It is of fundamental importance for physics and mathematics that $C_0(X)$ is
commutative. Conversely, Gelfand & Naimark (1943) proved that every commutative
C*-algebra $A$ is isomorphic to $C_0(X)$ for some locally compact Hausdorff space $X$,
which is determined by $A$ up to homeomorphism ($X$ is called the Gelfand
spectrum of $A$). Note that $C_0(X)$ has a unit (i.e. the function $1_X$ that is equal to 1 for
any $x$) iff $X$ is compact.

• Norm-closed subalgebras $A$ of the space $B(H)$ of all bounded operators on some
Hilbert space $H$ for which $a^* \in A$ iff $a \in A$; this includes the case $A = B(H)$. Here
one uses the standard operator norm

$$\|a\| = \sup\{\|a\psi\|,\ \psi \in H,\ \|\psi\| = 1\},$$

the algebraic operations are the natural ones, and the involution is the adjoint.
If $\dim(H) > 1$, $B(H)$ is a non-commutative C*-algebra. An important special
case is the C*-algebra $B_0(H)$ of all compact operators on $H$, which has no unit
whenever $H$ is infinite-dimensional (whereas $B(H)$ is always unital). In their
fundamental paper, Gelfand & Naimark (1943) also proved that every C*-algebra
is isomorphic to $A \subset B(H)$ for some Hilbert space $H$.

These classes are related as follows: in the commutative case $A = C_0(X)$, take

\[ H = L^2(X, \mu), \]

where the support of the measure $\mu$ is $X$, on which $C_0(X)$ acts by multiplication
operators, that is, $m_f \psi = f\psi$, where $f \in C_0(X)$ and $\psi \in L^2(X, \mu)$.
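
For a finite set $X$ (where every function is continuous and vanishes at infinity) the axioms and the multiplication representation can be checked by hand. The following Python sketch is only an illustration under that finiteness assumption, with counting measure playing the role of $\mu$; the names `star`, `mult`, `sup_norm`, and `l2_norm` are ours and do not occur in the text.

```python
# Illustrative sketch (finite X, counting measure); all names are hypothetical.
X = ["a", "b", "c"]

def star(f):
    """Involution: pointwise complex conjugation."""
    return {x: complex(f[x]).conjugate() for x in X}

def mult(f, g):
    """Pointwise product in C(X)."""
    return {x: f[x] * g[x] for x in X}

def sup_norm(f):
    """Supremum norm, which is a maximum since X is finite."""
    return max(abs(f[x]) for x in X)

def l2_norm(psi):
    """Norm in L^2(X, mu) for counting measure mu."""
    return sum(abs(psi[x]) ** 2 for x in X) ** 0.5

f = {"a": 3 + 4j, "b": -1j, "c": 2}
g = {"a": 1, "b": 2 - 1j, "c": -3}
delta_a = {"a": 1, "b": 0, "c": 0}   # unit vector concentrated at "a"

commutative = mult(f, g) == mult(g, f)
submultiplicative = sup_norm(mult(f, g)) <= sup_norm(f) * sup_norm(g)
c_star_gap = abs(sup_norm(mult(star(f), f)) - sup_norm(f) ** 2)
# The multiplication operator m_f applied to delta_a has norm |f(a)|,
# which here is the sup norm of f.
attained = l2_norm(mult(f, delta_a))
```

In this toy case the operator norm of the multiplication operator coincides with the supremum norm of $f$, which is what makes the representation isometric.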


As already noted, C*-algebras were introduced by Gelfand & Naimark (1943),
generalizing the rings of operators studied by von Neumann during 1930-1949,
partly in collaboration with Murray (von Neumann, 1930, 1931, 1938, 1940, 1949;
Murray & von Neumann, 1936, 1937, 1943). These rings are now called von Neumann
algebras, and arise as the special case where a C*-algebra $A \subset B(H)$ satisfies

\[ A = A'', \]

in which for any subset $S \subset B(H)$ the commutant of $S$ is defined by

\[ S' = \{a \in B(H) \mid ab = ba \ \forall b \in S\}, \]

in terms of which the bicommutant of $S$ is given by $S'' = (S')'$. Equivalently, a
C*-algebra is a von Neumann algebra $M$ iff it is the dual of some Banach space $M_*$
(which is unique, and contains the so-called normal states on $M$).

Generalizing von Neumann’s concept of a state on $B(H)$, a state on a C*-algebra
$A$ (as first defined by Segal in 1947) is a linear map

\[ \omega : A \to \mathbb{C} \]

that is positive in that

\[ \omega(a^* a) \geq 0 \]

for each $a \in A$, and normalized in that, noting that positivity implies boundedness,

\[ \|\omega\| = 1, \]

where $\| \cdot \|$ is the usual norm on the Banach dual $A^*$. If $A$ has a unit $1_A$, then in the
presence of positivity, the above normalization condition is equivalent to

\[ \omega(1_A) = 1. \]


The Riesz-Radon representation theorem in measure theory gives a bijective correspondence
between states $\omega$ on $A = C_0(X)$ and probability measures $\mu$ on $X$, viz.

\[ \omega(f) = \int_X d\mu\, f, \]

for any $f \in C_0(X)$. At the other end of the operator-algebraic world, if $A = B(H)$,
then any density operator $\rho$ on $H$ gives a state $\omega$ on $B(H)$ by

\[ \omega(a) = \mathrm{Tr}\,(\rho a), \]

but if $H$ is infinite-dimensional there are other states, which cannot be normal. Such
“singular” states are the C*-algebraic analogues of improper eigenstates for eigenvalues
in the continuous spectrum of some self-adjoint operator (think of position or
momentum), and hence they make perfect sense physically. Singular states play an
important role also mathematically, especially in the Kadison-Singer Conjecture.
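
To make the formula for density operators concrete, here is a minimal Python sketch for the simplest case $H = \mathbb{C}^2$. It merely checks positivity and normalization of the functional defined by one particular diagonal density operator; the helper names (`matmul`, `adjoint`, `trace`, `omega`) and the chosen matrices are illustrative assumptions, not part of the text.

```python
# Illustrative sketch for H = C^2; the helper names are hypothetical.
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

rho = [[0.75, 0], [0, 0.25]]        # a density operator: positive, trace one

def omega(a):
    """The state a -> Tr(rho a) on B(C^2)."""
    return trace(matmul(rho, a))

identity = [[1, 0], [0, 1]]
a = [[1, 2j], [0, -1]]              # an arbitrary (non-self-adjoint) operator
value = omega(matmul(adjoint(a), a))   # should be real and nonnegative
```

Here `omega(identity)` equals 1 (normalization) and `value` is a nonnegative real number (positivity), in line with the definition of a state given above.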



Introduction 


Let me close this Introduction with a small personal note on the way this book 
came into being. Of the three disciplines relevant to the foundations of physics, 
namely mathematics, physics, and philosophy, my expertise has always been located
within the first two, more specifically in mathematical physics. Nonetheless,
my interest in the foundations of physics was triggered already at school, notably 
by books like The Dancing Wu-Li Masters by Gary Zukav, The Tao of Physics by 
Fritjof Capra (both of which may appear suspicious in hindsight), and especially 
by Werner Heisenberg’s fascinating (though historically unreliable) autobiography 
Physics and Beyond (called Der Teil und das Ganze in German). The second autobiography
that made a huge impression on me at the time was Bertrand Russell’s,
which in particular made me want to go to Cambridge and become a so-called Apostle
(i.e. a member of an elitist secret conversation society that once included such
illustrious members as Moore, Keynes, Hardy, and Russell himself); the first dream 
was eventually realized (see below), about the second I have to remain silent. 

My interest in foundations was reinforced by two books on general relativity 
which I read as a first-year physics student, namely Raum · Zeit · Materie by Weyl
(1918) and The Mathematical Theory of Relativity by Eddington (1923). Although 
these were beyond my grasp at the time, they were clearly written in the spirit of 
Newton’s Principia, in that they were primarily treatises in natural philosophy, for 
which mathematical physics just provided the technical underpinning. Nonetheless, 
despite an unforgettable seminar by Jan Hilgevoord on the Heisenberg uncertainty 
relations in 1984, reporting on his recent joint work with Jos Uffink, foundations 
remained dormant during my undergraduate and PhD years (1981-1989). 

As a postdoc in Cambridge from 1989 onwards, I initially attended all seminars 
in any subject related to mathematics and/or physics I found remotely interesting, 
including the so-called Sigma Club, which at the time was organized by Michael 
Redhead. Michael was surrounded by a group of people I began to increasingly like, 
although I was and still am worried by their deification of John Bell (one speaker 
even asked his audience to stand whilst he was reading a passage from Speakable 
and Unspeakable in Quantum Mechanics). In any case, I was very kindly invited 
to speak at the Sigma Club on my recent paper on superselection rules and the 
measurement problem (whose approach I now eschew, since it violates Earman’s
Principle, see above as well as Chapter 11 below), followed by a private dinner in 
the posh Riverside Restaurant with Michael (who asked my opinion about David 
Lewis, whom I unfortunately had never heard of). Indeed, the generosity of inviting 
an absolute beginner in the philosophy of physics to speak in such a prestigious 
seminar endeared me even further to both the subject and the community. 

My main business remained mathematical physics, but, reinforcing the earlier 
spark I had got from reading Weyl and Eddington (and later also from von Neumann 
as well as Newton), two people (unfortunately no longer with us) made it clear to 
me that the goal of this discipline may include not only mathematics and physics, 
but also foundations, i.e., natural philosophy. These were Rob Clifton, who was a 
PhD student of Redhead and Butterfield, and Rudolf Haag, in whose group I had 
the honour to work during my year at Hamburg (1993-1994) as an Alexander von 
Humboldt Fellow (this was Haag’s last active year at the university, cf. Haag, 2010).
My first book in 1998, which I wrote during my last two years at Cambridge,
when the prospect of having to leave Academia and hence the urge to leave a permanent
record loomed large, did not yet reflect this attitude. But my lengthy article
on the classical-quantum interface in the Handbook of the Philosophy of Physics
edited by Butterfield and Earman already did, and so does the present book.

There is an inherent danger in a mathematical physics approach to foundations: 

‘I’m guided by the beauty of our weapons’ (Leonard Cohen) 

Our mathematical weapons, that is; this book is predicated on the idea that operator 
algebras provide the right language for quantum theory. If they don’t—for example, 
if path integrals are really its essence, as researchers especially in quantum gravity 
seem to believe, and there turns out to be a difference between the two toolkits—the 
mathematical underpinning of Bohrification would fall. Since our conceptual program
is closely linked to this mathematical language, it would presumably collapse,
too. Even if operator algebras stand, once some noncommutative alien gets direct 
access to the quantum world in defiance of Bohr’s doctrine of classical concepts, the 
conceptual framework behind Bohrification (and with it much of this book) would 
tremble. So far there has been no evidence for any of this, and as long as physics 
remains an empirical science I offer this book to the reader both as an introduction 
to modern mathematical methods in physics (in so far as these are relevant to foundational
questions), and also as an alternative to various interpretations of quantum
mechanics that seem to philosophize the physics of the problems away. 


Notes 

Each chapter is followed by a section called Notes, in which background and credits 
for the results in the given chapter are given. Such information is therefore absent in
the main text (except when theorems, typically famous ones, are named after their
discoverers, like Gleason, Wigner, and the like). This Introduction, which anomalously
contains some references, is an exception, but we still provide some notes to it. 

Since this book is not an exegesis of Bohr but rather an exposition of some mathematical
ideas partly inspired by his work (with no claim to retroactive endorsement
by Bohr or his followers), we hardly relied on the secondary literature on his philosophy,
except, as already mentioned, on Scheibe (1973) and Beller (1999), both
of which are pretty critical of Bohr. For a more balanced picture, one might consult
monographs like Folse (1985), Murdoch (1987), McEvoy (2001), Brock (2003), the
collection of essays edited by Faye & Folse (2017), as well as Dieks (2016a) and
Zinkernagel (2016). Secondary literature on Heisenberg’s philosophy of physics is 
scarce, but includes Camilleri (2009b). Though irrelevant to the present book, one 
cannot resist mentioning Landsman (2002) on Heisenberg’s controversial political 
war record, from which he tried to escape by writing the intriguing essay Ordnung 
der Wirklichkeit, published 50 years later as Heisenberg (1994). 

A propos, notes on von Neumann and operator algebras follow §C.25. 




Strictly speaking, no previous knowledge of quantum mechanics is needed to understand
this book, but it is hard to imagine readers of this book without such a background.
Beyond standard undergraduate physics courses, for mathematically serious
introductions to quantum mechanics (further to von Neumann (1932), which
founded the subject) we recommend Bongaarts (2015), Gustafson & Sigal (2003),
Hall (2013), Takhtajan (2008), and Thirring (2002). No previous acquaintance with 
the philosophy of quantum theory is required either, but once again it might be 
expected that typical readers of the present book have at least some awareness of 
this field. In fact, the author himself has only read a few such books from cover to 
cover, including Heisenberg (1958), Jammer (1966, 1974), Scheibe (1973), Earman
(1986), van Fraassen (1991), Bub (1997), Beller (1999), and Wallace (2012). 

From these books, apart from its obvious source Heisenberg (1958), Bohrification
(at least in its ‘exact’ variant) is conceptually akin to the program of Bub
(1997), which was based on Clifton & Bub (1996); the past tense seems appropriate
here, since Bub has meanwhile abandoned this program in favour of foundations
based on information theory (Bub, 2004). Anyway, given some preferred observable
$a \in B(H)_{\mathrm{sa}}$ and pure state $e \in \mathcal{P}_1(H)$ (i.e., a one-dimensional projection on $H$), the
Bub-Clifton approach looks for the largest C*-subalgebra $A$ of $B(H)$ on which one
may define something like a hidden variable compatible with the Born probabilities
emanating from the given state $e$ (the emphasis on some given $e$ comes from
the modal interpretation(s) of quantum mechanics). For generic states $e$ and observables
$a$, this typically allows $A$ to be noncommutative, which blasts the conceptual
framework of exact Bohrification. Requiring compatibility with quantum mechanics
for arbitrary states $e$, on the other hand, would force $A$ to be commutative. All this
relates to the Kochen-Specker Theorem; see the Notes to §6.1 for further details.

Finally, though remote from Wallace (2012) in our attempt to solve (or, in the 
light of the first quotation below, one should say “address”) the measurement problem
through physics rather than philosophy, even with this polar opposite author we
share the following attitude towards the foundations of quantum mechanics: 

‘The basic thesis of this book is that there is no quantum measurement problem (...) What 
I mean is that there is actually no conflict between the dynamics and ontology of (unitary) 
quantum theory and our empirical observations. (...) [I do not] wish to be read as offering 
yet one more “interpretation of quantum mechanics”. 

This book takes an extremely conservative approach to quantum mechanics (...) quantum 
mechanics can be taken literally (...) there is just unitary quantum mechanics. 

The way in which cats or tables exist is as structures within the underlying microphysics 
(...) [they are] emergent objects, higher-order entities.’ (Wallace, 2012, pp. 1, 2, 13, 38, 40) 

But although it may indeed apply to the town of Oxford, one might take issue with: 

‘It is simply false that there are alternative explanatory theories to Everett-interpreted quantum
mechanics which can reproduce the predictions of quantum theory (...) The Everett
interpretation is the only game in town.’ (Wallace, 2012, p. 43) 
Chapter 1

Classical physics on a finite phase space 


Throughout this chapter, $X$ is a finite set, playing the role of the configuration space
of some physical system, or, equivalently (as we shall see), of its pure state space (in
the continuous case, $X$ will be the phase space rather than the configuration space).
One should not frown upon finite sets: for example, the configuration space of $N$
bits is given by $X = \underline{2}^{\underline{N}}$, where for arbitrary sets $Y$ and $Z$, the set $Y^Z$ consists of all
functions $x : Z \to Y$, and for any $N \in \mathbb{N}$ we write $\underline{N} = \{1, 2, \ldots, N\}$ (although, following
the computer scientists, $\underline{2}$ usually denotes $\{0, 1\}$). More generally, if one has
a lattice $\Lambda \subset \mathbb{Z}^d$ and each site is the home of some classical object (say a “spin”) that
may assume $N$ different configurations, then $X = \underline{N}^{\Lambda}$, in that $x : \Lambda \to \underline{N}$ describes
the configuration in which the “spin” at site $n \in \Lambda$ takes the value $x(n) \in \underline{N}$.
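
As a small illustration of how quickly such configuration spaces grow, the following Python sketch enumerates all functions from a finite set of sites to a finite set of values. The function name `configuration_space` and the particular lattice are our own illustrative choices.

```python
from itertools import product

def configuration_space(sites, values):
    """All functions sites -> values, each encoded as a dict (hypothetical helper)."""
    sites = list(sites)
    return [dict(zip(sites, choice))
            for choice in product(values, repeat=len(sites))]

# Three bits: functions from {1, 2, 3} to {0, 1}.
bits = configuration_space(range(1, 4), (0, 1))

# A 2 x 2 lattice whose sites carry a "spin" with three possible values.
lattice = [(0, 0), (0, 1), (1, 0), (1, 1)]
spins = configuration_space(lattice, (1, 2, 3))
```

The two lists have $2^3 = 8$ and $3^4 = 81$ elements, respectively, in agreement with the cardinality $|Y|^{|Z|}$ of the set of all functions from $Z$ to $Y$.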

Although the setting is a priori deterministic, in that (knowing) some point $x \in X$
in its guise as a pure state at least in principle determines everything (there is
to say), the mathematical language will be probabilistic. Even within the confines 
of classicality this allows one to do statistical physics, and as such it also sheds 
light on e.g. the special status of x as an extreme probability measure (see below). 
Furthermore, the use of this language may be motivated by the goal of describing 
classical and quantum mechanics as analogously as possible at this elementary level. 

The following concepts play a central role in this chapter. Recall that the power
set $\mathcal{P}(X)$ of $X$ is the set of all subsets of $X$ (for finite $X$, these are all measurable).

Definition 1.1. 1. An event is a subset $U \subseteq X$, i.e., $U \in \mathcal{P}(X)$.

2. A probability distribution on $X$ is a function $p : X \to [0, 1]$ such that $\sum_{x \in X} p(x) = 1$.

3. A probability measure on $X$ is a function $P : \mathcal{P}(X) \to [0, 1]$ such that $P(X) = 1$
and $P(U \cup V) = P(U) + P(V)$ whenever $U \cap V = \emptyset$.

4. For a given probability measure $P$ on $X$, and an event $V \subseteq X$ such that $P(V) > 0$,
the conditional probability $P(U|V)$ of $U$ given $V$ is defined by

\[ P(U|V) = \frac{P(U \cap V)}{P(V)}. \tag{1.1} \]

5. A random variable on $X$ is a function $f : X \to \mathbb{R}$.

6. The spectrum of a random variable $f$ is the subset $\sigma(f) = \{f(x) \mid x \in X\}$ of $\mathbb{R}$.
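
The notions in Definition 1.1 translate almost verbatim into code. The following Python sketch uses a fair die as a toy example; the choice of $X$ and the names `P`, `conditional`, and `spectrum` are illustrative assumptions, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

X = {1, 2, 3, 4, 5, 6}                      # a fair die as a toy finite set
p = {x: Fraction(1, 6) for x in X}          # probability distribution on X

def P(U):
    """Probability measure of an event U (a subset of X)."""
    return sum((p[x] for x in U), Fraction(0))

def conditional(U, V):
    """Conditional probability P(U|V) = P(U ∩ V) / P(V), for P(V) > 0."""
    if P(V) == 0:
        raise ValueError("conditioning event has probability zero")
    return P(set(U) & set(V)) / P(V)

def spectrum(f):
    """Set of values taken by the random variable f on X."""
    return {f(x) for x in X}

even = {2, 4, 6}
high = {4, 5, 6}
parity = lambda x: x % 2                    # a random variable on X
```

For instance, `conditional(even, high)` evaluates to 2/3, and the spectrum of `parity` is the two-point set {0, 1}.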


1.1 Basic constructions of probability theory 

Probability distributions $p$ and probability measures $P$ determine each other by

\[ P(U) = \sum_{x \in U} p(x); \tag{1.2} \]

\[ p(x) = P(\{x\}), \tag{1.3} \]

but this is peculiar to finite sets (in general, probability measures will be primary). 
Two special classes of probability measures and of random variables stand out: 

• Each $y \in X$ defines a probability distribution $p_y$ by $p_y(x) = \delta_{xy}$, or explicitly
$p_y(x) = 1$ if $x = y$ and $p_y(x) = 0$ if $x \neq y$; for the corresponding probability
measure one has $P_y(U) = 1$ if $y \in U$ and $P_y(U) = 0$ if $y \notin U$.

• Each event $U \subseteq X$ defines a random variable $1_U$ (i.e., the characteristic function
of $U$) by $1_U(x) = 1$ if $x \in U$ and $1_U(x) = 0$ if $x \notin U$. Clearly, $\sigma(1_U) = \{0\}$ when
$U = \emptyset$, $\sigma(1_U) = \{1\}$ when $U = X$, and $\sigma(1_U) = \{0, 1\}$ otherwise. Note that
$1_U(x) = P_x(U)$. Conversely, any random variable $f$ with spectrum $\sigma(f) \subseteq \{0, 1\}$
is given by $f = 1_U$ for some $U \subseteq X$; just take $U = \{x \in X \mid f(x) = 1\}$. Such
functions may be construed as yes-no questions to the system (i.e. $f = 1$ versus
$f = 0$) and will lie at the basis of the logical interpretation of the theory (cf. §1.4).

The single most important construction in probability theory is as follows. 

Theorem 1.2. A probability distribution $p$ on $X$ and a random variable $f : X \to \mathbb{R}$
jointly yield a probability distribution $p_f$ on the spectrum $\sigma(f)$ by means of

\[ p_f(\lambda) = \sum_{x \in X \mid f(x) = \lambda} p(x). \tag{1.4} \]

In terms of the corresponding probability measure $P$ on $X$, one has

\[ p_f(\lambda) = P(f = \lambda), \tag{1.5} \]

where $f = \lambda$ denotes the event $\{x \in X \mid f(x) = \lambda\}$ in $X$. Similarly, the probability
measure $P_f$ on $\sigma(f)$ corresponding to the probability distribution $p_f$ is given by

\[ P_f(\Delta) = P(f \in \Delta), \tag{1.6} \]

where $\Delta \subseteq \sigma(f)$ and $f \in \Delta$ denotes the event $\{x \in X \mid f(x) \in \Delta\}$ in $X$.

The proof is trivial. Instead of $f = \lambda$, the notation $f^{-1}(\{\lambda\})$ might be used, and
similarly, $f^{-1}(\Delta)$ is the same as $f \in \Delta$. If $\lambda \in \sigma(f)$ is non-degenerate in that there
is exactly one $x_\lambda \in X$ such that $f(x_\lambda) = \lambda$, then one simply has $P(f = \lambda) = p(x_\lambda)$.
For example, combining both our special cases $P = P_y$ and $f = 1_U$ above yields

\[ P_y(1_U = 1) = 1 \text{ and } P_y(1_U = 0) = 0 \text{ if } y \in U; \tag{1.7} \]

\[ P_y(1_U = 1) = 0 \text{ and } P_y(1_U = 0) = 1 \text{ if } y \notin U. \tag{1.8} \]
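
The construction in Theorem 1.2 amounts to collecting the weights $p(x)$ according to the value $f(x)$. A minimal Python sketch, in which the name `pushforward` and the examples are our own illustrative assumptions:

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(p, f):
    """Distribution p_f on the spectrum of f: p_f(lam) = sum of p(x) with f(x) == lam."""
    p_f = defaultdict(Fraction)          # missing values start at Fraction(0)
    for x, weight in p.items():
        p_f[f(x)] += weight
    return dict(p_f)

X = [1, 2, 3, 4, 5, 6]
uniform = {x: Fraction(1, 6) for x in X}

U = {2, 4, 6}
indicator_U = lambda x: 1 if x in U else 0           # the random variable 1_U
point_at_2 = {x: Fraction(int(x == 2)) for x in X}   # the point distribution p_y, y = 2

mod3 = pushforward(uniform, lambda x: x % 3)
yes_no = pushforward(point_at_2, indicator_U)
```

Since $y = 2$ lies in $U$, the dictionary `yes_no` assigns probability 1 to the answer 1 and probability 0 to the answer 0, as in the case $y \in U$ of the example preceding this sketch.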
Given some probability measure $P$, the expectation value $E_P(f)$ and the variance
$\Delta_P(f)$ of a random variable $f$ with respect to $P$ are defined by, respectively,

\[ E_P(f) = \sum_{x \in X} f(x)\, p(x); \tag{1.9} \]

\[ \Delta_P(f) = E_P(f^2) - E_P(f)^2. \tag{1.10} \]

A simple calculation shows that $E_P$ may be written directly in terms of $P$ itself as

\[ E_P(f) = \sum_{\lambda \in \sigma(f)} \lambda \cdot P(f = \lambda). \tag{1.11} \]

Note that $\Delta_P(f) \geq 0$. The special role of the point measures $P_y$ may now be clarified:

Proposition 1.3. A probability measure $P$ takes the form $P = P_y$ for some $y \in X$ iff
$\Delta_P(f) = 0$ for all random variables $f : X \to \mathbb{R}$.

Proof. For “$\Rightarrow$”, we compute $E_{P_y}(f) = f(y)$, and hence $E_{P_y}(f^2) = f(y)^2$. In the
opposite direction, take $f = \delta_y$, so that $f^2 = f$ and hence $\Delta_P(f) = p(y) - p(y)^2$.
The assumption $\Delta_P(f) = 0$ for each $f$ implies that either $p(y) = 0$ or $p(y) = 1$ for
each $y \in X$. Definition 1.1.2 then implies that $p(y) = 1$ for exactly one $y \in X$. □
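
Proposition 1.3 is easy to probe numerically. The Python sketch below computes expectation and variance for a point distribution and for a genuinely mixed one; the function names and the sample distributions are illustrative assumptions.

```python
from fractions import Fraction

def expectation(p, f):
    """E_P(f) as the sum of f(x) p(x) over x."""
    return sum((f(x) * w for x, w in p.items()), Fraction(0))

def variance(p, f):
    """Variance of f: E_P(f^2) - E_P(f)^2."""
    return expectation(p, lambda x: f(x) ** 2) - expectation(p, f) ** 2

def delta(y):
    """The random variable that equals 1 at y and 0 elsewhere."""
    return lambda x: 1 if x == y else 0

point = {0: Fraction(0), 1: Fraction(1), 2: Fraction(0)}      # P = P_y with y = 1
mixed = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}

g = lambda x: x * x - 3                                        # an arbitrary random variable
var_point = variance(point, g)
var_mixed = variance(mixed, delta(0))
```

The point distribution has `var_point == 0`, whereas the mixed one gives `var_mixed == 1/4`, which is $p(y) - p(y)^2$ for $p(y) = 1/2$, exactly as in the proof.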

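More generally, a collection $f_1, \ldots, f_n$ of $n$ random variables and a (single) probability
distribution $p$ on $X$ jointly define a probability distribution $p_{f_1 \cdots f_n}$ on the
product $\sigma(f_1) \times \cdots \times \sigma(f_n)$ of the individual spectra by

\[ p_{f_1 \cdots f_n}(\lambda_1, \ldots, \lambda_n) = \sum_{x \in X \mid f_1(x) = \lambda_1, \ldots, f_n(x) = \lambda_n} p(x). \tag{1.12} \]

Once again, this may be rewritten as

\[ p_{f_1 \cdots f_n}(\lambda_1, \ldots, \lambda_n) = P(f_1 = \lambda_1, \ldots, f_n = \lambda_n), \tag{1.13} \]

where the argument of $P$ denotes the intersection $\bigcap_{k=1}^n (f_k = \lambda_k)$, i.e.,

\[ (f_1 = \lambda_1, \ldots, f_n = \lambda_n) = \{x \in X \mid f_1(x) = \lambda_1, \ldots, f_n(x) = \lambda_n\}. \tag{1.14} \]

Simple calculations then yield results for the so-called marginal distributions, like

\[ \sum_{\lambda_{l+1} \in \sigma(f_{l+1}), \ldots, \lambda_n \in \sigma(f_n)} P(f_1 = \lambda_1, \ldots, f_n = \lambda_n) = P(f_1 = \lambda_1, \ldots, f_l = \lambda_l), \tag{1.15} \]

where $1 \leq l < n$. The above constructions also apply to the corresponding conditional
probabilities: given $m$ additional random variables $a_1, \ldots, a_m$, one has

\[ \sum_{\lambda_{l+1} \in \sigma(f_{l+1}), \ldots, \lambda_n \in \sigma(f_n)} P(f_1 = \lambda_1, \ldots, f_n = \lambda_n \mid a_1 = \alpha_1, \ldots, a_m = \alpha_m) \tag{1.16} \]

\[ = P(f_1 = \lambda_1, \ldots, f_l = \lambda_l \mid a_1 = \alpha_1, \ldots, a_m = \alpha_m). \tag{1.17} \]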
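
The joint distribution and its marginals can be computed in the same spirit as before. In the Python sketch below, the names `joint` and `marginal` as well as the two random variables are illustrative assumptions; the check is that summing out the last variable reproduces the distribution of the first.

```python
from collections import defaultdict
from fractions import Fraction

def joint(p, fs):
    """Joint distribution of the random variables in fs, keyed by tuples of values."""
    out = defaultdict(Fraction)
    for x, w in p.items():
        out[tuple(f(x) for f in fs)] += w
    return dict(out)

def marginal(joint_dist, keep):
    """Sum out all but the first `keep` variables of a joint distribution."""
    out = defaultdict(Fraction)
    for lams, w in joint_dist.items():
        out[lams[:keep]] += w
    return dict(out)

p = {x: Fraction(1, 6) for x in range(1, 7)}    # uniform distribution on a die
f1 = lambda x: x % 2                            # parity
f2 = lambda x: int(x > 3)                       # "high" indicator

p_12 = joint(p, [f1, f2])
p_1 = marginal(p_12, 1)
```

Here `p_1` coincides with `joint(p, [f1])`, which is the content of the marginalization formula in this toy case.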
1.2 Classical observables and states

Given a finite set $X$, we may form the set $C(X)$ of all complex-valued functions on
$X$, enriched with the structure of a complex vector space under pointwise operations:

\[ (\lambda \cdot f)(x) = \lambda f(x) \quad (\lambda \in \mathbb{C}); \tag{1.18} \]

\[ (f + g)(x) = f(x) + g(x). \tag{1.19} \]

We use the notation $C(X)$ with some foresight, anticipating the case where $X$ is no
longer finite, but in any case, since for the moment it is, every function is continuous.
Moreover, the vector space structure on $C(X)$ may be extended to that of a
commutative algebra (where, by convention, all our algebras are associative and are
defined over the complex scalars) by defining multiplication pointwise, too:

\[ (f \cdot g)(x) = f(x) g(x). \tag{1.20} \]

Note that this algebra has a unit $1_X$, i.e., the function identically equal to 1.

For finite X, this structure suffices for X to be recovered from C(X), as follows. 

Definition 1.4. The Gelfand spectrum $\Sigma(A)$ of a (complex) algebra $A$ is the set of
all nonzero linear maps $\omega : A \to \mathbb{C}$ that satisfy $\omega(fg) = \omega(f)\omega(g)$.

These are, of course, precisely the nonzero algebra homomorphisms from A to C. 

Proposition 1.5. The Gelfand spectrum $\Sigma(C(X))$ is isomorphic (as a set) to $X$.

Proof. Each $x \in X$ defines a map $\omega_x : C(X) \to \mathbb{C}$ by $\omega_x(f) = f(x)$. One obviously
has $\omega_x \in \Sigma(C(X))$, so we have a map $X \to \Sigma(C(X))$, $x \mapsto \omega_x$. We show that this map
is a bijection. Injectivity is easy: if $\omega_x = \omega_y$, then $f(x) = f(y)$ for each $f \in C(X)$,
so taking $f = \delta_z$ for each $z \in X$ gives $x = y$ (here $\delta_z(x) = \delta_{xz}$). To prove surjectivity,
we note that since $C(X)$ is finite-dimensional as a vector space, with basis $(\delta_y)_{y \in X}$,
each linear functional $\omega : C(X) \to \mathbb{C}$ takes the form

\[ \omega(f) = \sum_{x \in X} \mu(x) f(x) \tag{1.21} \]

for some function $\mu : X \to \mathbb{C}$. For $\omega \in \Sigma(C(X))$, find some $z \in X$ for which $\mu(z) \neq 0$
(this has to exist, as $\omega \neq 0$). For arbitrary $w \in X$, imposing $\omega(\delta_w \delta_z) = \omega(\delta_w)\omega(\delta_z)$
enforces $\mu = \delta_z$ (which also shows that $z$ is unique), and hence $\omega = \omega_z$. □
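
The surjectivity argument can be illustrated by brute force. The Python sketch below scans a small grid of candidate weight functions $\mu$ on a three-point set and keeps those for which the associated linear functional is nonzero and multiplicative on the basis of delta functions (which suffices by bilinearity). The grid of candidates is an arbitrary finite sample chosen for illustration, so this is a sanity check rather than a proof; all names are hypothetical.

```python
from itertools import product

X = ["a", "b", "c"]

def delta(z):
    """Basis function delta_z of C(X), stored as a dict."""
    return {x: 1 if x == z else 0 for x in X}

def omega(mu, f):
    """Linear functional f -> sum over x of mu(x) f(x)."""
    return sum(mu[x] * f[x] for x in X)

def is_character(mu):
    """Nonzero and multiplicative on all pairs of basis functions."""
    if all(mu[x] == 0 for x in X):
        return False
    basis = [delta(z) for z in X]
    return all(
        omega(mu, {x: f[x] * g[x] for x in X}) == omega(mu, f) * omega(mu, g)
        for f in basis for g in basis
    )

# Candidate weight functions with values in {0, 1, 2} (an arbitrary sample).
candidates = [dict(zip(X, vals)) for vals in product([0, 1, 2], repeat=len(X))]
characters = [mu for mu in candidates if is_character(mu)]
```

Exactly three candidates survive, namely the delta functions themselves, one for each point of $X$.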

The physically relevant set $R(X)$ of all real-valued functions on $X$ is obviously
a real vector space inside $C(X)$. To recover it algebraically, we equip $C(X)$ with an
involution, which on an arbitrary (not necessarily commutative) algebra $A$ is defined
as an anti-linear anti-homomorphism that squares to $\mathrm{id}_A$, i.e., a real-linear map $* : A \to A$
(written $a \mapsto a^*$) that satisfies $(\lambda a)^* = \overline{\lambda} a^*$, $(ab)^* = b^* a^*$, and $a^{**} = a$. In our case
$A = C(X)$, which is commutative, the anti-homomorphism property simply becomes $(fg)^* = f^* g^*$.
In any case, we define this involution by pointwise complex conjugation, i.e.,

\[ f^*(x) = \overline{f(x)}. \tag{1.22} \]
We evidently recover the real-valued functions in the involutive algebra $C(X)$ as

\[ R(X) = C(X)_{\mathrm{sa}} = \{f \in C(X) \mid f^* = f\}. \tag{1.23} \]

Finally, although we do not need this yet, we note that $C(X)$ has a natural norm

\[ \|f\|_\infty = \sup_{x \in X} \{|f(x)|\}. \tag{1.24} \]

These structures turn $C(X)$ into a commutative C*-algebra (cf. Definition C.1).

Definition 1.6. The algebra of observables of the physical system described by the
phase space $X$ is $C(X)$, seen as a (commutative) C*-algebra in the above way.

Hence elements of $C(X)$ are called observables (a term that really should be applied
only to its self-adjoint elements, i.e., those satisfying $f^* = f$).

We have thus equipped the random variables on $X$ with enough structure to recover
$X$ itself, and now turn to the other side of the coin, viz. the probability measures
on $X$. Here the relevant mathematical structure is that of a compact convex set,
a concept we only need to define in the context of an ambient (real) vector space.

Definition 1.7. A subset $K$ of a (real or complex) vector space $V$ is called convex if
the straight line segment between any two points on $K$ lies in $K$. Expressed formally,
this means that whenever $v, w \in K$ and $t \in (0, 1)$, one has $tv + (1 - t)w \in K$.

The following probabilistic reformulation of this notion is very useful. 

Proposition 1.8. A set $K \subseteq V$ is convex iff for any $k$, given $k$ probabilities $(t_1, \ldots, t_k)$
(i.e., $t_i \geq 0$ and $\sum_i t_i = 1$) and $k$ points $(v_1, \ldots, v_k)$ in $K$, one has $\sum_{i=1}^k t_i v_i \in K$.

Proof. Taking $k = 2$ recovers Definition 1.7 from its probabilistic version. Conversely,
one uses induction on $k$, using the identity (assuming $0 < t_k < 1$):

\[ t_1 v_1 + \cdots + t_k v_k = (1 - t_k)\left( \frac{t_1}{1 - t_k} v_1 + \cdots + \frac{t_{k-1}}{1 - t_k} v_{k-1} \right) + t_k v_k. \]

□


Any linear subspace of $V$ is trivially convex, as is any translate thereof (i.e., any
affine subspace of $V$). Another, much more important example is the convex hull
$\mathrm{co}(S)$ of any subset $S \subseteq V$; noting that the intersection of any family of convex sets
is again convex, $\mathrm{co}(S)$ may be defined as the intersection of all convex subsets of $V$
that contain $S$, or, equivalently, as the smallest convex subset of $V$ that contains $S$
(whose existence is guaranteed by the previous remark). Proposition 1.8 then yields

\[ \mathrm{co}(S) = \Big\{ \sum_{i=1}^k t_i v_i \ \Big|\ k \in \mathbb{N},\ v_i \in S,\ t_i \geq 0,\ \sum_{i=1}^k t_i = 1 \Big\}. \tag{1.25} \]

In particular, if $S = \{v_1, \ldots, v_k\}$ is a finite set, then one simply has

\[ \mathrm{co}(\{v_1, \ldots, v_k\}) = \Big\{ \sum_{i=1}^k t_i v_i \ \Big|\ t_i \geq 0,\ \sum_{i=1}^k t_i = 1 \Big\}. \tag{1.26} \]
The convex hull of any finite set of points in $\mathbb{R}^n$ is called a convex polytope. Such
convex sets are closed and bounded (since none of the $t_i \geq 0$ can walk away too far
without violating the condition $\sum_i t_i = 1$), and hence are compact. In particular,

\[ \Delta_n = \Big\{ x \in \mathbb{R}^{n+1} \ \Big|\ x_i \geq 0,\ \sum_i x_i = 1 \Big\} \tag{1.27} \]

is a convex polytope called a simplex. For example, $\Delta_1$ is the line segment from
$(0, 1)$ to $(1, 0)$ in $\mathbb{R}^2$. We would like to say that $\Delta_1$ is “isomorphic” to the unit interval
$[0, 1]$, so we define two convex sets $K_1, K_2$ to be isomorphic (as such) if there is a
bijection $f : K_1 \to K_2$ that is affine, in that for $t \in (0, 1)$ and $v_1, v_2 \in K_1$, we have

\[ f(t v_1 + (1 - t) v_2) = t f(v_1) + (1 - t) f(v_2). \tag{1.28} \]

Then the function $f : \Delta_1 \to [0, 1]$ given by $f(\lambda, 1 - \lambda) = \lambda$, where $\lambda \in [0, 1]$, will do.
Similarly, $\Delta_2 \subset \mathbb{R}^3$ is isomorphic to any equilateral triangle in $\mathbb{R}^2$ with sides of unit
length, whereas $\Delta_3$ is just the tetrahedron (which is one of the five Platonic solids).

There are many other convex polytopes (cf. §B.11), but simplices are of prime
importance for us, since $\Delta_n$ is isomorphic to the set $\mathrm{Pr}(X)$ of all probability distributions
on a set $X = \{0, \ldots, n\}$ with $n + 1$ points; the identification $\mathrm{Pr}(X) \ni p \leftrightarrow x \in \Delta_n$
is given by $x_i = p(i + 1)$. In particular, we see that for any finite set $X$, $\mathrm{Pr}(X)$ is a
compact convex set. This is also clear from Definitions 1.1 and 1.7 (and will even
be true for general compact phase spaces $X$, cf. Corollary B.17 and §C.25).

Definition 1.9. The state space of the physical system described by a (finite) space
$X$ is the set $\mathrm{Pr}(X)$ of all probability measures on $X$ (or, equivalently, of all probability
distributions on $X$), seen as a compact convex set.

Thus a probability measure (or distribution) on $X$ is often called a state (of the
physical system described by $X$). The operation of passing from states $P, Q \in \mathrm{Pr}(X)$
to a new state $tP + (1 - t)Q \in \mathrm{Pr}(X)$, where $t \in (0, 1)$ as usual, or, more generally,
from a (finite) family of states $(P_i)$ and a set $(t_i)$ of probabilities (i.e., $t_i \geq 0$ and
$\sum_i t_i = 1$) to the convex sum $\sum_i t_i P_i$, is called mixing.
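
Mixing is straightforward to express in code. The Python sketch below forms a convex sum of two distributions on a two-point set and checks that the result is again a probability distribution; the names `is_distribution` and `mix` and the sample states are illustrative assumptions.

```python
from fractions import Fraction

def is_distribution(p):
    """Nonnegative weights that sum to one."""
    return all(w >= 0 for w in p.values()) and sum(p.values()) == 1

def mix(states, weights):
    """Convex sum of distributions defined on a common finite set."""
    if any(t < 0 for t in weights) or sum(weights) != 1:
        raise ValueError("weights must be probabilities")
    return {x: sum(t * s[x] for t, s in zip(weights, states))
            for x in states[0]}

P = {"a": Fraction(1), "b": Fraction(0)}
Q = {"a": Fraction(1, 4), "b": Fraction(3, 4)}
M = mix([P, Q], [Fraction(1, 3), Fraction(2, 3)])
```

In this example the mixture `M` is the uniform distribution on the two points, and `is_distribution(M)` holds, illustrating that $\mathrm{Pr}(X)$ is closed under mixing.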

It is possible to recover $X$ from its associated state space $\mathrm{Pr}(X)$, as follows.

Definition 1.10. The (extreme) boundary $\partial_e K$ of a convex set $K$ consists of all
points $v \in K$ satisfying the following condition:

if $v = tw + (1 - t)x$ for certain $w, x \in K$ and $t \in (0, 1)$, then $v = w = x$.

Elements $v \in \partial_e K$ of the boundary are called extreme points of $K$.

We will now compute the boundary of $\mathrm{Pr}(X)$. The result may be expressed by

\[ \partial_e \Delta_n = \{e_1, \ldots, e_{n+1}\}, \tag{1.29} \]

where $(e_1, \ldots, e_{n+1})$ is the standard basis of $\mathbb{R}^{n+1}$ (i.e., $e_1 = (1, 0, \ldots, 0)$, etc.). However,
we will give a direct probabilistic proof. We already noted the special probability
measures $P_x$, $x \in X$. The association $x \mapsto P_x$ defines a map from $X$ to $\mathrm{Pr}(X)$.
Proposition 1.11. The set $X$ is isomorphic to the boundary $\partial_e \mathrm{Pr}(X)$ through $x \mapsto P_x$.

Proof. It is convenient to work with probability distributions $p$ rather than probability
measures $P$. First, $x \mapsto p_x$ is trivially injective from $X$ to $\mathrm{Pr}(X)$: if $x \neq y$
then $p_x(x) = 1$ whereas $p_y(x) = 0$, so $p_x \neq p_y$. Second, $p_x \in \partial_e \mathrm{Pr}(X)$. For suppose
one has $p_x = tp + (1 - t)q$ for some $p, q \in \mathrm{Pr}(X)$ and $t \in (0, 1)$. Hence
$p_x(y) = tp(y) + (1 - t)q(y)$. Taking $y \neq x$ yields $p(y) = q(y) = 0$, so that $p = q = p_x$.
Consequently, $X \subseteq \partial_e \mathrm{Pr}(X)$.

The converse inclusion is (contrapositively) equivalent to the property that for
any $p \neq p_x$ (for all $x$), there are $q$ and $r$, $q \neq r$, and $t \in (0, 1)$, with $p = tq + (1 - t)r$.
Indeed, if $p \neq p_x$ for all $x$, there is some $x_0 \in X$ with $0 < p(x_0) < 1$. Now define $q$, $r$, and $t$
by $q(x_0) = 1$ and $q(x) = 0$ for all $x \neq x_0$, $r(x_0) = 0$ and $r(x) = p(x)/(1 - p(x_0))$ for all $x \neq x_0$, and
finally $t = p(x_0)$. Then $p = tq + (1 - t)r$ and $q \neq r$. □
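
The second half of this proof is constructive, and the construction can be written out directly. The following Python sketch splits a distribution that is not a point distribution into two distinct distributions; the name `split` is ours, and the choice of the first suitable point $x_0$ is an arbitrary convention.

```python
from fractions import Fraction

def split(p):
    """Return (t, q, r) with p = t q + (1 - t) r and q != r, or None if p is a point distribution."""
    x0 = next((x for x, w in p.items() if 0 < w < 1), None)
    if x0 is None:
        return None                      # every weight is 0 or 1: p is extreme
    t = p[x0]
    q = {x: Fraction(int(x == x0)) for x in p}                    # point distribution at x0
    r = {x: Fraction(0) if x == x0 else w / (1 - t) for x, w in p.items()}
    return t, q, r

p = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
t, q, r = split(p)
recombined = {x: t * q[x] + (1 - t) * r[x] for x in p}

point = {0: Fraction(0), 1: Fraction(1), 2: Fraction(0)}
```

For the sample `p`, the recombined distribution equals `p` while `q` and `r` differ, so `p` is not extreme; for `point` the function returns `None`, in line with the first half of the proof.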

The simplest example would be $X = \{0, 1\}$, so that $\mathrm{Pr}(X) \cong [0, 1]$ by mapping the
distribution $p \in \mathrm{Pr}(X)$ to $p(1)$. Since one may directly verify that $\partial_e [0, 1] = \{0, 1\}$,
under the above isomorphism one therefore has $\partial_e \mathrm{Pr}(X) \cong \{0, 1\}$. Analogously,
$\partial_e (0, 1) = \emptyset$, so that the boundary of a convex set may apparently be empty. Hence
we see that one remarkable ingredient of Proposition 1.11 lies in the claim that the
convex set $\mathrm{Pr}(X)$ actually has a (nonempty) boundary! This is no accident: by the
Krein-Milman Theorem (cf. §B.10), this is true for any compact convex set (which
is consistent with the counterexample just given). For example, in quantum mechanics
we will encounter the case of $K = B^3$ (i.e. the closed unit ball in $\mathbb{R}^3$) as the state
space of a qubit, whose (extreme) boundary is the two-sphere $S^2$, cf. Proposition
2.9. Something similar is true in any dimension, but beware of surprises: if $K = \Delta_2$
is an equilateral triangle in the plane, then its extreme boundary $\partial_e K$ consists of the
vertices of $K$ (whereas its faces form the geometric boundary of the triangle).

The general problem arises whether some point $v \in K$ of a compact convex set $K$
may be written as a convex sum (or, more generally, an integral) of extreme points
of $K$, and if so, to what extent this extremal decomposition

\[ v = \sum_i t_i v_i, \quad t_i \geq 0, \quad \sum_i t_i = 1, \quad v_i \in \partial_e K, \tag{1.30} \]

which for simplicity has been assumed to be a finite sum here, is unique. Without
proof, we state a general result of convexity theory, called Caratheodory’s Theorem:

Theorem 1.12. If $K$ is a nonempty compact convex subset of $\mathbb{R}^n$, then $\partial_e K \neq \emptyset$, and
each point of $K$ is a convex sum of at most $n + 1$ points in $\partial_e K$.

If $K = \Delta_n$, then this sum generically has $n + 1$ points and is unique. Probabilistically:

Proposition 1.13. If $X$ is finite, then any probability measure $P \in \mathrm{Pr}(X)$ may be
written in a unique way as a finite mixture of extreme probability measures, viz.

\[ P = \sum_{x \in X} t_x P_x. \tag{1.31} \]
Proof. Take $t_x = P(\{x\})$ in the sense of Definition 1.1, or, equivalently, $t_x = E_P(\delta_x)$
in the sense of (1.9). To see that this decomposition is unique, use Proposition 1.11,
i.e. $\partial_e \mathrm{Pr}(X) \cong X$, in (1.30) to force the index set to be $I = X$ and apply both sides of (1.31) to $\delta_x$. □

The state space and the algebra of observables may also be defined in terms of 
each other. We start with the (re)construction of states from observables, where the 
following definition and proposition may leave a hybrid impression. The rationale 
behind our approach is that for many purposes it is easier to work with the complex
algebra $C(X)$, but on the other hand, compact convex sets are most naturally
defined in terms of real vector spaces. Fortunately, it is easy to switch between the
two: we already know how to obtain the real part $R(X)$ from $C(X)$, see (1.23), and
conversely, $C(X)$ is simply the complexification of the real vector space $R(X)$.

Definition 1.14. A state on $C(X)$ is a linear map $\omega : C(X) \to \mathbb{C}$ that satisfies:

1. $\omega(f^2) \geq 0$ for each $f \in C(X)$ with $f^* = f$ (positivity);

2. $\omega(1_X) = 1$ (normalization).

The first condition obviously comes down to $\omega(f) \geq 0$ whenever $f \geq 0$ pointwise.

Equivalently, we may define a state on $R(X)$ as a real-linear map $\omega_{\mathbb{R}} : R(X) \to \mathbb{R}$
that satisfies the very same conditions. Indeed, a state $\omega_{\mathbb{R}}$ on $R(X)$ defines a
complex-linear map $\omega : C(X) \to \mathbb{C}$ by $\omega(f + ig) = \omega_{\mathbb{R}}(f) + i\omega_{\mathbb{R}}(g)$, where $f, g \in R(X)$.
This map satisfies the same conditions of positivity and normalization. Conversely,
$\omega$ may be restricted to the real part $R(X)$ of $C(X)$, so that there is no real
(sic) difference between $\omega$ and $\omega_{\mathbb{R}}$. Hence we will use these interchangeably, often
even dropping the suffix $\mathbb{R}$ on $\omega$. One advantage of this ability to switch is that a
state $\omega$ on $C(X)$ may be regarded as an element of the real vector space $R(X)^*$.
Doing so shows that the terminology of Definitions 1.9 and 1.14 is consistent:

Theorem 1.15. There is a bijective correspondence between states $\omega$ on $C(X)$ and
probability measures $P$ on $X$, given by $\omega \leftrightarrow E_P$, cf. (1.9) and (1.11). Therefore, as
a subset of the (real) vector space $R(X)^*$ of all (real-) linear maps from $R(X)$ to $\mathbb{R}$,
the set $S(C(X))$ of all states on $C(X)$ coincides with the set $\mathrm{Pr}(X)$ of all probability
measures on $X$. In particular, the state space $S(C(X))$ of $C(X)$ is a compact convex
set in $R(X)^*$ (as a finite-dimensional vector space with its usual topology).

Proof. Given a state $\omega$, define a function $p : X \to \mathbb{R}$ by $p(x) = \omega(\delta_x)$. Since $\delta_x \geq 0$
pointwise, positivity of $\omega$ yields $p(x) \geq 0$. Noting that $1_X = \sum_x \delta_x$, normalization
then forces $\sum_x p(x) = 1$, so that $p$ is a probability distribution on $X$. Hence
$P \in \mathrm{Pr}(X)$, where $P$ is the probability measure corresponding to $p$. Conversely,
$P \in \mathrm{Pr}(X)$ defines a map $E_P : R(X) \to \mathbb{R}$ by (1.9), which is positive and normalized.
Note that compactness and convexity of the set $S(C(X))$ in $R(X)^*$ follow directly
from its definition, i.e., even without knowing that it equals $\mathrm{Pr}(X)$. □
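
The correspondence in Theorem 1.15 can be tried out on a small example. The sketch below assumes that (1.9) is the usual expectation value $E_P(f) = \sum_x p(x) f(x)$; the three-point set `X`, the distribution `p`, and all function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

X = ["a", "b", "c"]  # hypothetical three-point set

def delta(x):
    """The function delta_x: equal to 1 at x and 0 elsewhere."""
    return {y: (1 if y == x else 0) for y in X}

def state_from_distribution(p):
    """E_P as a linear functional on functions f: X -> numbers (assumed form of (1.9))."""
    return lambda f: sum(p[x] * f[x] for x in X)

def distribution_from_state(omega):
    """Recover p(x) = omega(delta_x), as in the proof of Theorem 1.15."""
    return {x: omega(delta(x)) for x in X}

p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
omega = state_from_distribution(p)
one = {x: 1 for x in X}                  # the unit function 1_X
recovered = distribution_from_state(omega)
```

Going from `p` to `omega` and back returns the original distribution, which is the bijection of the theorem restricted to this example.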

Consequently, we may refer to $S(C(X))$ as the state space of $C(X)$ without any
ambiguity, and we will always regard state spaces of (unital) C*-algebras $A$ (cf.
Appendix C) as compact convex sets $S(A)$, where in the present case $A = C(X)$.

1.3 Pure states and transition probabilities

For any C*-algebra $A$ (with unit), and hence in particular for $A = C(X)$, elements of
the boundary $\partial_e S(A)$ are called pure states, and we call

$$P(A) = \partial_e S(A) \tag{1.32}$$

the pure state space of $A$. States that are not pure are called mixed.

Theorem 1.16. One has $P(C(X)) \cong X$, in that the following map is an isomorphism:

$$X \to P(C(X)), \qquad x \mapsto \omega_x, \qquad \omega_x(f) = f(x). \tag{1.33}$$

Proof. Combine Proposition 1.11 and Theorem 1.15. □

For finite $X$ this isomorphism is merely meant as a bijection between sets (and for
general compact Hausdorff spaces $X$ it will be a homeomorphism of topological
spaces), but we will now introduce some additional structure on pure state spaces
that will enrich Theorem 1.16 to an isomorphism of so-called sets with a transition
probability. This will be necessary in order to reconstruct the observables from the
pure states, but it also clarifies the general probabilistic structure of physics (note
that the following definition is unusual in probability theory!).

Definition 1.17. 1. A transition probability on a set $X$ is a function

$$\tau : X \times X \to [0,1] \tag{1.34}$$

that satisfies $\tau(x,y) = 1$ iff $x = y$ and $\tau(x,y) = \tau(y,x)$ (symmetry).

The simplest example of a transition probability (on any set $X$) is obviously

$$\tau(x,y) = \delta_{xy}. \tag{1.35}$$

The point is that this transition probability may be derived from the classical
C*-algebra of observables $C(X)$ by the following formula (assuming $X$ finite):

$$\delta_{xy} = \inf\{f(x) \mid f \in C(X),\ 0 \leq f \leq 1_X,\ f(y) = 1\}. \tag{1.36}$$

Indeed, for $x = y$ this is a tautology, whereas for $x \neq y$ the infimum (which is zero)
is attained by $f = \delta_y$. In terms of the pure state space $P(C(X))$, which is isomorphic
to but not equal to $X$, cf. Theorem 1.16, this formula may be written as

$$\delta_{xy} = \inf\{\omega_x(f) \mid f \in C(X),\ 0 \leq f \leq 1_{C(X)},\ \omega_y(f) = 1\}. \tag{1.37}$$

Furthermore (and this is the real point, so that we already have to mention it here,
ahead of a more detailed treatment in the context of quantum mechanics), the
right-hand side of (1.37) may be generalized to any finite-dimensional C*-algebra $A$ by

$$\tau_A(\omega, \omega') = \inf\{\omega(a) \mid a \in A,\ 0 \leq a \leq 1_A,\ \omega'(a) = 1\}, \tag{1.38}$$

where $\omega, \omega' \in P(A)$. Since (1.38) clearly generalizes (1.37), for $A = C(X)$ we have

$$\tau_{C(X)}(\omega_x, \omega_y) = \delta_{xy}. \tag{1.39}$$

Note that the symmetry property in Definition 1.17 is not obvious from (1.38), but
in the classical case $A = C(X)$ it is true by computation, and the same will hold in
quantum theory. To motivate these definitions, we recall that $f$ in (1.37), and likewise
$a$ in (1.38), are yes-no questions to the system, so that the transition probability
$\tau_A(\omega, \omega')$ monitors to what extent the states $\omega$ and $\omega'$ may be sharply distinguished
by asking such questions. If they can, there should be some question $a$ for which
$\omega'(a) = 1$ and $\omega(a) = 0$, so that $\tau_A(\omega, \omega') = 0$ (if $\omega \neq \omega'$, of course). As we have seen,
in the classical case this can always be done. However, we shall see this is no longer
the case in quantum mechanics, where pure states may be thus distinguished iff they
correspond to orthogonal unit vectors in Hilbert space. Further motivation for the
expression (1.38) is post hoc, as it turns out to allow a reconstruction of the vector
space of observables $A$, supplemented by the part of its algebraic structure that
determines its logical and probabilistic structure (viz. the ability to form squares,
$a \mapsto a^2$) from $P(A)$ with its associated transition probability. See Theorem C.179.
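
Formula (1.36) can be checked by brute force on a small set. The sketch below restricts the infimum to functions with values on a coarse grid in $[0,1]$, which is an assumption made only to keep the search finite; since $\delta_y$ lies on that grid, the restricted minimum coincides with the true infimum. The set `X` and the grid are illustrative.

```python
from itertools import product

X = [0, 1, 2]            # hypothetical three-point set, labelled by indices
GRID = [0.0, 0.5, 1.0]   # allowed values of f, a coarse sample of [0, 1]

def tau(x, y):
    """min{ f(x) : 0 <= f <= 1, f(y) = 1 } over grid-valued f, cf. (1.36)."""
    candidates = (f for f in product(GRID, repeat=len(X)) if f[y] == 1.0)
    return min(f[x] for f in candidates)

table = [[tau(x, y) for y in X] for x in X]
```

The resulting `table` is the Kronecker delta of (1.35), as the text asserts for the classical case.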

First, we develop some theory that puts both classical and quantum mechanics 
into a more general setting. Notwithstanding the formal incorporation of the former, 
the underlying Hilbert space thinking will be obvious throughout. 

Definition 1.18. Let $(X, \tau)$ be a set with a transition probability.

1. A subset $O \subseteq X$ is orthonormal if $\tau(x,y) = \delta_{xy}$ for all $x, y \in O$.

2. A basis of a set $X$ with a transition probability $\tau$ is an orthonormal family $B \subseteq X$
such that for each $x \in X$ one has

$$\sum_{u \in B} \tau(x,u) = 1. \tag{1.40}$$

A basis of a subset $S \subseteq X$ is an orthonormal family $B \subseteq S$ such that (1.40) holds
for each $x \in S$. Relative to such a basis $B$ of $S$, we define $\tau_S : X \to \mathbb{R}$ by

$$\tau_S(x) = \sum_{u \in B} \tau(x,u). \tag{1.41}$$

As a special case, for $S = \{u\}$ we write $\tau_{\{u\}} = \tau_u$, so that

$$\tau_u(x) = \tau(x,u). \tag{1.42}$$

3. The orthocomplement $S^\perp$ of some subset $S \subseteq X$ is defined as

$$S^\perp = \{y \in X \mid \tau(x,y) = 0\ \forall x \in S\}. \tag{1.43}$$

4. A subset $S \subseteq X$ is orthoclosed if $S^{\perp\perp} = S$ (where $S^{\perp\perp} = (S^\perp)^\perp$).

5. A resolution of the identity in $X$ is a family of orthogonal orthoclosed subsets
$(S_j)_j$ (i.e., $\tau(x_i, x_j) = 0$ if $x_i \in S_i$, $x_j \in S_j$, and $i \neq j$), for which $\sum_j \tau_{S_j} = 1_X$.
6. An observable for the pair $(X, \tau)$ is a bounded function $f : X \to \mathbb{R}$ of the form

$$f = \sum_i c_i \cdot \tau_{y_i}, \qquad c_i \in \mathbb{R},\ y_i \in X. \tag{1.44}$$

The real vector space of such observables is called $\ell^\infty(X, \tau)$.

7. A spectral resolution of an observable $f \in \ell^\infty(X, \tau)$ is a decomposition

$$f = \sum_\lambda \lambda \cdot \tau_{S_\lambda}, \tag{1.45}$$

where $(S_\lambda)_\lambda$ is a resolution of the identity and each $\lambda \in \mathbb{R}$ occurs at most once.

In the present section $X$ is finite, whilst in the following section on quantum
mechanics on finite-dimensional Hilbert spaces at least all bases will be finite, so that
there are no convergence issues. In general, $B$ may be infinite, in which case (1.40)
is defined as the least upper bound of all finite partial sums, and all sums in
Definition 1.18 are defined pointwise (i.e., in $x$). In that case, eq. (1.45) may need to be
adapted through limit constructions. Furthermore, one may worry about the
basis-dependence of $\tau_S$ in (1.41), but fortunately it turns out that in all sets with a
transition probability that arise as pure state spaces defined by C*-algebras according to
(1.38), the function $\tau_S$ is independent of the basis $B$ whenever $S$ is orthoclosed. In
that case, spectral resolutions exist and are unique, and one may turn the real vector
space $\ell^\infty(X, \tau)$ of part 6 into a Jordan algebra by defining a product $\circ$ through

$$f^2 = \sum_\lambda \lambda^2 \cdot \tau_{S_\lambda}; \tag{1.46}$$

$$f \circ g = \tfrac{1}{4}\left((f+g)^2 - (f-g)^2\right). \tag{1.47}$$

In the classical case this yields the pointwise product (1.20), whereas in quantum
mechanics it recovers (half) the anti-commutator, $a \circ b = \tfrac{1}{2}(ab + ba)$. Both are
examples of Jordan products (cf. §C.25), i.e., commutative products $\circ$ satisfying the
curious axiom (C.619).

All this trivializes if $\tau$ is given by (1.35), where $X$ need not even be finite:

1. Any subset $O \subseteq X$ is orthonormal.

2. The set $B = X$ itself is the only basis of $(X, \tau)$, and analogously $B = S$ is the
only basis of a subset $S \subseteq X$.

3. The orthocomplement $S^\perp$ is the set-theoretic complement $S^c = X \setminus S$.

4. Hence any subset $S \subseteq X$ is orthoclosed.

5. Any partition $X = \bigsqcup_j S_j$ yields a resolution of the identity.

6. Any bounded function $f : X \to \mathbb{R}$ is an observable, so that when $X$ is finite,

$$\ell^\infty(X, \tau) = R(X) = C(X, \mathbb{R}); \tag{1.48}$$

7. The spectral resolution (1.45) of $f$ is given (analogously to operator theory) by

$$f = \sum_{\lambda \in \sigma(f)} \lambda \cdot 1_{f^{-1}(\{\lambda\})}, \tag{1.49}$$

cf. Definition 1.1.5. In particular, spectral resolutions in (1.48) are unique.
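
The classical spectral resolution in item 7 amounts to grouping the points of $X$ by the value that $f$ takes on them. The following sketch illustrates this for a hypothetical four-point set; the helper names `spectrum`, `level_set`, and `indicator` are assumptions introduced for the example.

```python
X = ["a", "b", "c", "d"]                          # hypothetical finite set
f = {"a": 2.0, "b": -1.0, "c": 2.0, "d": 0.5}      # an observable f: X -> R

def spectrum(f):
    """sigma(f): the set of values taken by f."""
    return sorted(set(f.values()))

def level_set(f, lam):
    """The subset of X on which f equals lam."""
    return {x for x in f if f[x] == lam}

def indicator(S):
    """Characteristic function 1_S on X."""
    return {x: (1.0 if x in S else 0.0) for x in X}

# One level set per spectral value: these sets partition X.
resolution = {lam: level_set(f, lam) for lam in spectrum(f)}

# Rebuild f as the sum over lam of lam * 1_{level set of lam}.
rebuilt = {x: sum(lam * indicator(S)[x] for lam, S in resolution.items()) for x in X}
```

Each spectral value occurs once and the level sets are disjoint, which is why the resolution is unique in this classical setting.
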
1.4 The logic of classical mechanics

Whatever one's route to $C(X, \mathbb{R})$ as the algebra of observables, i.e. either as a
starting point or as a derived concept as in (1.48), it determines the logical structure of
classical mechanics (we here restrict ourselves to propositional logic). According to
the general scheme reviewed in §D.2, apart from the usual logical connectives $\neg$,
$\wedge$, $\vee$, and $\to$ for not, and, or, and implies, a propositional theory needs a set $\Sigma_X$ of
atomic propositions. These are provided by $C(X, \mathbb{R})$, and $\Sigma_X$ consists of all
expressions $f \in \Delta$ (we expect no confusion between this notation for both propositions in
logic and events in probability theory), where $f : X \to \mathbb{R}$ is a function, and $\Delta$ is some
subset of $\mathbb{R}$. As we shall see, $f \in \Delta$ is always false if $\Delta \cap \sigma(f) = \emptyset$, so we might
as well assume that $\Delta \subseteq \sigma(f)$. We write $f = \lambda$ for $f \in \{\lambda\}$. From these elementary
propositions, propositions are constructed inductively using the iterative rules
of propositional logic (see §D.2). This produces a set $B_X = B_{\Sigma_X}$ of propositions.

Of course, there are logical relations between our atomic propositions (and hence
between elements of $B_X$). For example, if $\Delta \subseteq \Delta'$, then $f \in \Delta$ should imply $f \in \Delta'$.
Such relations may be formulated as axioms of some propositional theory $\mathcal{T}_X$
describing the logic of classical mechanics. These axioms take the following form:

$$(f \in \Gamma) \to (g \in \Delta) \quad \text{iff} \quad f^{-1}(\Gamma) \subseteq g^{-1}(\Delta). \tag{1.50}$$

This may also be formulated through the notion of semantic entailment. For each
$x \in X$, we define a valuation $V_x : \Sigma_X \to \{0,1\}$ (cf. §D.2) by

$$V_x(f \in \Delta) = 1 \quad \text{iff} \quad f(x) \in \Delta, \tag{1.51}$$

extended to a map $V_x : B_X \to \{0,1\}$ through the recursive use of truth tables.
Defining the semantic entailment relation $\models_X$ on $B_X$ by $\alpha \models_X \beta$ iff $V_x(\alpha) = 1$ implies
$V_x(\beta) = 1$ for all $x \in X$, it is easy to see that $\alpha \to \beta$ follows from the axioms (1.50)
iff $\alpha \models_X \beta$. In order to compute the ensuing Lindenbaum algebra $L_X = L_{\mathcal{T}_X}$, we note that

$$(f \in \Gamma) \leftrightarrow (g \in \Delta) \quad \text{iff} \quad f^{-1}(\Gamma) = g^{-1}(\Delta). \tag{1.52}$$

Writing $\sim_X$ for $\sim_{\mathcal{T}_X}$ (which is the equivalence relation given by $\models_X$, too), we find

$$(f \in \Delta) \sim_X (1_{f^{-1}(\Delta)} = 1), \tag{1.53}$$

where we recall that $1_A$ is the characteristic (or indicator) function of $A$. Using the
truth tables for $\wedge$ and for $\neg$ we also obtain (in terms of the complement $\Delta^c = \mathbb{R} \setminus \Delta$):

$$(f \in \Gamma) \wedge (g \in \Delta) \sim_X (1_{f^{-1}(\Gamma) \cap g^{-1}(\Delta)} = 1); \tag{1.54}$$

$$\neg(f \in \Delta) \sim_X (f \in \Delta^c) \sim_X (1_{f^{-1}(\Delta^c)} = 1). \tag{1.55}$$

Finally, the truth tables yield logical (and hence semantic) equivalences like

$$\alpha \vee \beta \sim_X \neg(\neg\alpha \wedge \neg\beta). \tag{1.56}$$
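
The agreement between the inclusion criterion in (1.50) and semantic entailment via the valuations (1.51) can be tested directly on a finite set. In this sketch the set `X`, the observables `f` and `g`, and the function names are hypothetical choices made for illustration.

```python
X = [0, 1, 2, 3]                       # hypothetical four-point set
f = {0: 1, 1: 1, 2: 2, 3: 3}           # two sample observables X -> R
g = {0: 5, 1: 5, 2: 5, 3: 7}

def preimage(h, Delta):
    """h^{-1}(Delta) as a subset of X."""
    return {x for x in X if h[x] in Delta}

def valuation(x, h, Delta):
    """V_x(h in Delta) from (1.51): 1 if h(x) lies in Delta, else 0."""
    return 1 if h[x] in Delta else 0

def entails(h1, D1, h2, D2):
    """Semantic entailment: V_x(h1 in D1) = 1 implies V_x(h2 in D2) = 1 for all x."""
    return all(valuation(x, h2, D2) == 1 for x in X if valuation(x, h1, D1) == 1)

def included(h1, D1, h2, D2):
    """The set-theoretic criterion on the right-hand side of (1.50)."""
    return preimage(h1, D1) <= preimage(h2, D2)
```

For instance, `entails(f, {1, 2}, g, {5})` and `included(f, {1, 2}, g, {5})` agree, and both fail when the first proposition is replaced by `f in {3}`.
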
Combining the specific and the general equivalences (1.53) - (1.56), we have:

Lemma 1.19. Any proposition in $B_X$ is logically (and semantically) equivalent
(relative to $X$) to one of the form $1_U = 1$, for some event $U \subseteq X$. Furthermore,

$$\neg(1_U = 1) \sim_X (1_{U^c} = 1); \tag{1.57}$$

$$(1_U = 1) \wedge (1_V = 1) \sim_X (1_{U \cap V} = 1); \tag{1.58}$$

$$(1_U = 1) \vee (1_V = 1) \sim_X (1_{U \cup V} = 1). \tag{1.59}$$

Theorem 1.20. The Lindenbaum algebra $L_X$ is isomorphic (as a Boolean algebra)
to the power set $\mathcal{P}(X)$ of $X$ under the map $\varphi : L_X \to \mathcal{P}(X)$ induced by

$$\varphi([f \in \Delta]_X) = f^{-1}(\Delta). \tag{1.60}$$

In particular, the logical connectives $\neg$, $\wedge$ and $\vee$ (descended to $L_X$) turn into
set-theoretic complementation $(-)^c$, intersection $\cap$, and union $\cup$, respectively, in that

$$\varphi([\neg\alpha]_X) = \varphi([\alpha]_X)^c; \tag{1.61}$$

$$\varphi([\alpha \wedge \beta]_X) = \varphi([\alpha]_X) \cap \varphi([\beta]_X); \tag{1.62}$$

$$\varphi([\alpha \vee \beta]_X) = \varphi([\alpha]_X) \cup \varphi([\beta]_X), \tag{1.63}$$

and $\varphi$ maps the partial order $\leq$ on $L_X$ into set-theoretic inclusion $\subseteq$, i.e.,

$$[\alpha]_X \leq [\beta]_X \quad \text{iff} \quad \varphi([\alpha]_X) \subseteq \varphi([\beta]_X). \tag{1.64}$$

This is immediate from Lemma 1.19. Interestingly, the Boolean algebra structure
just derived as the governor of the (propositional) logic of classical mechanics may
be reformulated in terms of the Jordan algebraic structure (1.46) - (1.47) of $\ell^\infty(X, \tau)$,
or, when $X$ is finite, of the C*-algebra of observables $C(X)$ itself:

• Events $U \subseteq X$ (and hence, by Theorem 1.20, logical equivalence classes of
propositions) correspond bijectively to characteristic functions $1_U$ on $X$, that is, to
yes-no questions (having spectrum in $\{0,1\}$). Algebraically, these are precisely
the idempotents in $\ell^\infty(X, \tau)$, i.e., those functions $e$ satisfying $e^2 = e$.

• In terms of those, the partial ordering and the logical connectives are given by

$$e \leq f \quad \text{iff} \quad e \circ f = e; \tag{1.65}$$

$$\neg e = 1_X - e; \tag{1.66}$$

$$e \wedge f = e \circ f; \tag{1.67}$$

$$e \vee f = e + f - e \circ f. \tag{1.68}$$

Indeed, in this case $\circ$ is pointwise multiplication (1.20). Using $1_U \cdot 1_V = 1_{U \cap V}$
yields (1.67), (1.65) comes down to $U \subseteq V$ iff $U \cap V = U$, (1.66) is $1_X - 1_U = 1_{U^c}$,
and (1.68) follows by writing its right-hand side as $1_X - (1_X - e) \wedge (1_X - f)$.
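
The dictionary (1.65) - (1.68) between idempotents and events can be verified mechanically. The sketch below represents a characteristic function on a hypothetical five-point set as a tuple of zeros and ones; the function names are illustrative only.

```python
X = [0, 1, 2, 3, 4]   # hypothetical five-point set

def ind(U):
    """Characteristic function 1_U as a tuple indexed by X."""
    return tuple(1 if x in U else 0 for x in X)

def jordan(e, f):
    """Pointwise product: the classical case of the product in (1.47)."""
    return tuple(a * b for a, b in zip(e, f))

def neg(e):
    """(1.66): 1_X - e."""
    return tuple(1 - a for a in e)

def meet(e, f):
    """(1.67): e o f."""
    return jordan(e, f)

def join(e, f):
    """(1.68): e + f - e o f."""
    return tuple(a + b - a * b for a, b in zip(e, f))

def leq(e, f):
    """(1.65): e <= f iff e o f = e."""
    return jordan(e, f) == e

U, V = {0, 1, 2}, {2, 3}
```

With these definitions `meet`, `join`, and `neg` reproduce intersection, union, and complement of the underlying sets.
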
1.5 The GNS-construction for C(X)

As a bridge from classical to quantum mechanics (as well as a good exercise), we
finally inject some Hilbert space theory into classical physics by discussing the
GNS-construction of C*-algebra theory for the special case of $C(X)$, where $X$ remains
finite. In general, for each state $\omega$ on a C*-algebra $A$, the GNS-construction
canonically yields a Hilbert space $H_\omega$ (which is finite-dimensional for $A = C(X)$ with
finite $X$) and a representation $\pi_\omega$ of $A$ on $H_\omega$, in the sense of a (complex) linear map

$$\pi_\omega : A \to B(H_\omega) \tag{1.69}$$

that satisfies

$$\pi_\omega(ab) = \pi_\omega(a)\pi_\omega(b); \tag{1.70}$$

$$\pi_\omega(a^*) = \pi_\omega(a)^*. \tag{1.71}$$

Furthermore, $H_\omega$ contains a special unit vector $\Omega_\omega$ that is cyclic for $\pi_\omega$ in that

$$\pi_\omega(A)\Omega_\omega = \{\pi_\omega(a)\Omega_\omega,\ a \in A\} = H_\omega, \tag{1.72}$$

at least in the relevant case where $\dim(H_\omega) < \infty$; otherwise, the left-hand side is
merely dense in $H_\omega$ and one needs to take the (norm) closure to obtain $H_\omega$.
Furthermore, $\Omega_\omega$ realizes the state $\omega$ as a quantum-mechanical expectation value by

$$\omega(a) = \langle \Omega_\omega, \pi_\omega(a)\Omega_\omega \rangle_{H_\omega}. \tag{1.73}$$

Given $\omega \in S(A)$, the GNS-construction starts with the vector spaces

$$N_\omega = \{a \in A \mid \omega(a^*a) = 0\}; \tag{1.74}$$

$$H_\omega = A/N_\omega. \tag{1.75}$$

Now, if $b \in N_\omega$ and $a \in A$, then $ab \in N_\omega$, because of the important inequality

$$\omega(b^*a^*ab) \leq \|a\|^2 \omega(b^*b). \tag{1.76}$$

This is true for any C*-algebra $A$, but below we prove it only for our example.
Assuming (1.76) for the moment, the action of $A$ on itself by left multiplication
descends to a well-defined action on $H_\omega$, which we call $\pi_\omega$. In other words, if
$b_\omega \in H_\omega$ is the image of $b \in A$ under the canonical projection $A \to A/N_\omega$, then

$$\pi_\omega(a)b_\omega = (ab)_\omega. \tag{1.77}$$

Crucially, this vector space $H_\omega$ is equipped with a canonical inner product

$$\langle a_\omega, b_\omega \rangle = \omega(a^*b). \tag{1.78}$$

Indeed, this form is well defined, and is positive definite because $\omega$ is a state.
In general, $H_\omega$ as defined by (1.75) with inner product (1.78) is merely a
pre-Hilbert space, which needs to be completed in the associated norm, and it takes some
effort to check that the operators defined by (1.77) are bounded. In our example, on
the other hand, $H_\omega$ is finite-dimensional and hence complete. In any case, it is easy
to verify the properties (1.70) - (1.73), whilst (1.72) holds with $\Omega_\omega$ given by the
image $1_\omega$ of the unit $1 = 1_A$.
We now prove (1.76) for $A = C(X)$. From Theorem 1.15 we have $\omega = E_P$, and by
(1.9) and (1.24), the inequality (1.76) comes down to the obviously correct result

$$\sum_x p(x)|f(x)|^2|g(x)|^2 \leq \|f\|_\infty^2 \sum_x p(x)|g(x)|^2. \tag{1.79}$$

Writing $N_{E_P} \equiv N_P$, we may also check directly that if $g \in N_P$ and $f \in C(X)$, then
$fg \in N_P$. Indeed, in terms of the set $\mathrm{supp}(P) \subseteq X$ defined by

$$\mathrm{supp}(P) = \{x \in X \mid p(x) > 0\}, \tag{1.80}$$

we have

$$N_P = \{f \in C(X) \mid f(x) = 0\ \forall x \in \mathrm{supp}(P)\}, \tag{1.81}$$

and clearly $g = 0$ on $\mathrm{supp}(P)$ implies $fg = 0$ on $\mathrm{supp}(P)$. We now compute $H_P$ and
$\pi_P$. From (1.81) we have $f - g \in N_P$ and hence $f \sim g$ iff $f(x) = g(x)$ for all
$x \in \mathrm{supp}(P)$, where $\sim$ is the equivalence relation whose equivalence classes $f_P$ define
elements of $H_P = C(X)/N_P$. Hence $f_P$ is simply the restriction of $f$ to $\mathrm{supp}(P)$, and

$$H_P = \ell^2(X, P) \tag{1.82}$$

is the Hilbert space that consists of these restrictions, with inner product

$$\langle f_P, g_P \rangle = \sum_{x \in \mathrm{supp}(P)} p(x)\overline{f(x)}g(x). \tag{1.83}$$

The representation (1.77) then trivially gives

$$\pi_P(f)g_P = f_P g_P, \tag{1.84}$$

so that $\pi_P(f)$ is the multiplication operator defined by $f$ on $\ell^2(X, P)$. In functional
analysis one often denotes elements $g_P \in \ell^2(X, P)$ by the functions $g$ themselves,
and similarly writes $\pi_P(f)$ as $f$, so that (1.84) simply reads $\pi_P(f)g = fg$.

The operator norm of $\pi_P(f)$ is easily computed to be

$$\|\pi_P(f)\| = \sup\{|f(x)|,\ x \in \mathrm{supp}(P)\} = \|f|_{\mathrm{supp}(P)}\|_\infty. \tag{1.85}$$

Indeed, the bound $\|\pi_P(f)\| \leq \|f|_{\mathrm{supp}(P)}\|_\infty$ is immediate from the definition

$$\|\pi_P(f)\| = \sup\{\|\pi_P(f)g_P\|,\ g_P \in H_P,\ \|g_P\| = 1\}, \tag{1.86}$$

and equality in this bound follows from applying the operator $\pi_P(f)$ to the function
$g = 1_U$, where $U \subseteq X$ is any set where $|f|$ attains its maximum $\|f|_{\mathrm{supp}(P)}\|_\infty$.
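
The GNS data for $C(X)$ described above are concrete enough to compute by hand or by machine. The following sketch does so for a hypothetical three-point set in which one point has probability zero, so that the null space (1.81) is nontrivial; the distribution `p`, the sample functions, and all helper names are assumptions made for the example.

```python
import math

X = ["a", "b", "c"]
p = {"a": 0.25, "b": 0.75, "c": 0.0}       # hypothetical distribution; "c" has zero weight
support = [x for x in X if p[x] > 0]        # supp(P), cf. (1.80)

def inner(f, g):
    """<f_P, g_P> from (1.83); conjugate-linear in the first slot."""
    return sum(p[x] * complex(f[x]).conjugate() * g[x] for x in support)

def norm(f):
    return math.sqrt(inner(f, f).real)

def pi(f, g):
    """pi_P(f) g_P = (f g)_P from (1.84): pointwise multiplication."""
    return {x: f[x] * g[x] for x in X}

def in_null_space(f):
    """Membership of N_P, cf. (1.81): f vanishes on supp(P)."""
    return all(f[x] == 0 for x in support)

one = {x: 1 for x in X}                     # representative of the cyclic vector
f = {"a": 2, "b": -3j, "c": 100}
delta_b = {"a": 0, "b": 1, "c": 0}
delta_c = {"a": 0, "b": 0, "c": 1}
op_norm = max(abs(f[x]) for x in support)   # right-hand side of (1.85)
```

Here `delta_c` lies in the null space, the value `f["c"]` plays no role in $H_P$, and the bound (1.85) is saturated by the indicator `delta_b` of the point where $|f|$ is maximal on the support.
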
Notes

§1.1. Basic constructions of probability theory
§1.2. Classical observables and states

For (advanced) treatments of convexity theory and probability theory in contexts
relevant to mathematical physics we recommend Israel (1979), Alfsen & Shultz
(2001), and Simon (2001).

§1.3. Pure states and transition probabilities

Transition probabilities (in the abstract sense meant here) were introduced by von
Neumann, but his manuscript from 1937 was only published in 1981 as von Neumann
(1981/1937). This remarkable paper has remained largely unused (or even unknown)
in both mathematical physics and operator algebras; Mielnik (1968), Shultz
(1982), and Landsman (1996, 1997) are exceptions. An extensive discussion with
further references may be found in Landsman (1998a).

§1.4. The logic of classical mechanics

Unless one counts Boole (1847), it seems that the logical analysis of classical
mechanics was initiated by the famous paper of Birkhoff & von Neumann (1936),
which was primarily concerned with quantum logic (cf. §2.10). Our use of semantic
implication (also in the quantum case) was inspired by Rédei (1998).

§1.5. The GNS-construction for C(X)

See §C.12 for the GNS-construction in general.





Chapter 2 

Quantum mechanics on a finite-dimensional 
Hilbert space 


The quantum analogue of a finite set $X$ (in its role as a configuration space in
classical mechanics) is the finite-dimensional Hilbert space $\ell^2(X)$, by which we mean
the vector space of functions $\psi : X \to \mathbb{C}$, equipped with the inner product

$$\langle \psi, \varphi \rangle = \sum_{x \in X} \overline{\psi(x)}\varphi(x). \tag{2.1}$$

There is no issue of convergence here, but later on we will use the same notation
for infinite sets $X$, where $\ell^2(X)$ is restricted to those functions (i.e. sequences) for
which $\sum_{x \in X} |\psi(x)|^2 < \infty$ (which also guarantees convergence of the sum in (2.1)).
If $X \cong n$ as sets (i.e., $|X| = n$), we have a unitary isomorphism of Hilbert spaces

$$\ell^2(X) \cong \mathbb{C}^n \tag{2.2}$$

through the map $\psi \mapsto (\psi(1), \ldots, \psi(n))$, where $\mathbb{C}^n$ has the standard inner product
$\langle w, z \rangle = \sum_i \overline{w_i}z_i$. In particular, the function $\delta_k \in \ell^2(n)$, defined by $\delta_k(l) = \delta_{kl}$, is
mapped to the $k$'th standard basis vector $\upsilon_k \equiv |k\rangle$ of $\mathbb{C}^n$, i.e., $\upsilon_1 = (1, 0, \ldots, 0)$, etc.
In the special case $X = N^\Lambda$ considered in Chapter 1, we have $|X| = N^{|\Lambda|}$ and hence

$$\ell^2(N^\Lambda) \cong \mathbb{C}^{N^{|\Lambda|}} \cong \bigotimes_{n \in \Lambda} \mathbb{C}^N_n \equiv \bigotimes_\Lambda \mathbb{C}^N, \tag{2.3}$$

where $\mathbb{C}^N_n = \mathbb{C}^N$ for each $n \in \Lambda$, so that the suffix $n$ merely labels which copy of
$\mathbb{C}^N$ is meant (see §C.13 for tensor products of Hilbert spaces). Explicitly, a canonical
unitary isomorphism $\ell^2(N^\Lambda) \to \bigotimes_\Lambda \mathbb{C}^N$ is given by linear extension of the map

$$\delta_x \mapsto \bigotimes_{n \in \Lambda} \upsilon_{x(n)}, \tag{2.4}$$

where $x : \Lambda \to N$ and hence $\upsilon_{x(n)} \in \mathbb{C}^N$. Thus elements of the tensor product $\bigotimes_\Lambda \mathbb{C}^N$
may be seen as wave-functions on spin configuration space (and vice versa). In
particular, elementary tensor products of basis vectors in $\bigotimes_\Lambda \mathbb{C}^N$ correspond to
wave-functions in $\ell^2(N^\Lambda)$ that are $\delta$-peaked at some 'classical' spin configuration.
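
The map (2.4) can be made explicit in coordinates, where the tensor product of basis vectors becomes a Kronecker product. The sketch below assumes $N = 2$ and a hypothetical three-site lattice, and it counts basis vectors from 0 rather than from 1 as in the text; `basis_vector`, `kron`, and `image_of_delta` are illustrative names.

```python
from itertools import product

N = 2                  # single-site values are 0, ..., N-1
LAMBDA = [0, 1, 2]     # hypothetical three-site lattice

def basis_vector(k, dim):
    """k'th standard basis vector of C^dim (k counted from 0)."""
    return [1 if i == k else 0 for i in range(dim)]

def kron(v, w):
    """Kronecker product of two coordinate vectors."""
    return [a * b for a in v for b in w]

def image_of_delta(x):
    """Image of delta_x under (2.4): the tensor product of the vectors u_{x(n)}."""
    out = [1]
    for n in LAMBDA:
        out = kron(out, basis_vector(x[n], N))
    return out

configs = list(product(range(N), repeat=len(LAMBDA)))   # all spin configurations x
images = [image_of_delta(x) for x in configs]
```

Each configuration is sent to a distinct standard basis vector of $\mathbb{C}^{N^{|\Lambda|}}$, so the linear extension of the map is unitary.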

2.1 Quantum probability theory and the Born rule 

In preparation for this chapter, the reader would do well to review Appendix A.

The probabilistic setting of quantum mechanics is given by the following
counterpart of Definition 1.1 (from which conditional probabilities are lacking, though).

Definition 2.1. Let $H$ be a finite-dimensional Hilbert space.

1. A (quantum) event is a linear subspace $L$ of $H$ (which is automatically closed).

2. A (quantum) probability distribution is a density operator, i.e., a positive
operator $\rho$ on $H$ (in that $\langle \psi, \rho\psi \rangle \geq 0$ for all $\psi \in H$) such that

$$\mathrm{Tr}(\rho) = 1. \tag{2.5}$$

We denote the set of all density operators on $H$ by $\mathcal{D}(H)$.

3. A (quantum) random variable is a self-adjoint operator $a$ on $H$ (i.e., $a^* = a$).

4. The spectrum of a self-adjoint operator $a$ is the set $\sigma(a) \subset \mathbb{R}$ of its eigenvalues.

Being positive, a density matrix $\rho$ is self-adjoint, so by Theorem A.10, notably
(A.40), and Definition 2.1.2 we have

$$\rho = \sum_i p_i |\upsilon_i\rangle\langle\upsilon_i|, \qquad p_i \geq 0, \qquad \sum_i p_i = 1, \tag{2.6}$$

where the $(\upsilon_i)$ form an orthonormal set in $H$ and $|\upsilon_i\rangle\langle\upsilon_i|$ is the (orthogonal)
projection on the one-dimensional subspace $\mathbb{C} \cdot \upsilon_i$. As in the classical case, one special
class of density operators and one special class of random variables stand out:

• Each unit vector $\psi \in H$ defines a density operator

$$\rho_\psi = e_\psi \equiv |\psi\rangle\langle\psi|, \tag{2.7}$$

i.e., the (orthogonal) projection on the one-dimensional subspace $\mathbb{C} \cdot \psi$. A
basis (which by convention always means an orthonormal basis) of eigenvectors
of $\rho_\psi$ consists of $\upsilon_1 = \psi$ itself, supplemented by any basis $(\upsilon_2, \ldots, \upsilon_{\dim(H)})$ of
the orthogonal complement of $\mathbb{C} \cdot \psi$. The corresponding probabilities in (2.6) are
evidently $p_1 = 1$ and $p_i = 0$ for all $i > 1$.

• Each quantum event $L \subseteq H$ defines the corresponding projection $e_L$ (which is
self-adjoint, i.e. a random variable): If $(\upsilon_j)$ is a basis of $L$, then $e_L = \sum_j |\upsilon_j\rangle\langle\upsilon_j|$.
If $L = H$ then $e_L = 1$ with $\sigma(e_L) = \{1\}$. If $L = \{0\}$ then $e_L = 0$ with $\sigma(e_L) = \{0\}$.
In all other cases, i.e. for nonzero proper subspaces $L$, one has $\sigma(e_L) = \{0,1\}$.
Conversely, any self-adjoint operator $a$ with spectrum $\sigma(a) \subseteq \{0,1\}$ is given by
$a = e_L$ for some subspace $L \subseteq H$; just take $L = \{\psi \in H \mid a\psi = \psi\}$. Such operators
correspond to yes-no questions to the system and lie at the basis of the logical
interpretation of quantum theory due to Birkhoff and von Neumann; see §2.10.

The following quantum analogue of Theorem 1.2 is based on Theorem A.10.

Theorem 2.2. A density operator $\rho$ on $H$ and a self-adjoint operator $a : H \to H$
jointly yield a probability distribution $p_a$ on the spectrum $\sigma(a)$ by the Born rule

$$p_a(\lambda) = \mathrm{Tr}(\rho e_\lambda). \tag{2.8}$$

The associated probability measure $P_a$ is given at $\Delta \subseteq \sigma(a)$ by (cf. (A.42))

$$P_a(\Delta) = \mathrm{Tr}(\rho e_\Delta). \tag{2.9}$$

Proof. Positivity of the numbers $p_a(\lambda)$ follows by taking the trace over a basis of
eigenvectors $\upsilon_i$ of $\rho$, with corresponding eigenvalues $p_i \geq 0$. This yields

$$\mathrm{Tr}(\rho e_\lambda) = \sum_i p_i \langle \upsilon_i, e_\lambda \upsilon_i \rangle = \sum_i p_i \|e_\lambda \upsilon_i\|^2 \geq 0.$$

Eqs. (A.38) and (2.5) then give $\sum_\lambda p_a(\lambda) = 1$. Eq. (2.9) follows from the equality
$P_a(\Delta) = \sum_{\lambda \in \Delta} p_a(\lambda)$, cf. (1.2), and (A.42). □

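In particular, if $\rho = \rho_\psi$, writing $p_a^\psi$ for the associated probability, (2.8) yields

$$p_a^\psi(\lambda) = \langle \psi, e_\lambda \psi \rangle = \|e_\lambda \psi\|^2. \tag{2.10}$$

If in addition $\lambda \in \sigma(a)$ is non-degenerate, so that $e_\lambda = |\upsilon_\lambda\rangle\langle\upsilon_\lambda|$ for some unit vector
$\upsilon_\lambda$ with $a\upsilon_\lambda = \lambda\upsilon_\lambda$, then the Born rule (2.8) assumes its original form

$$p_a^\psi(\lambda) = |\langle \psi, \upsilon_\lambda \rangle|^2. \tag{2.11}$$

Specializing (2.10) to the random variable $a = e_L$ defined by an event $L \subseteq H$ yields

$$p_{e_L}^\psi(1) = \|e_L \psi\|^2. \tag{2.12}$$

If $L = \mathbb{C} \cdot \varphi$ is one-dimensional, too, in which case we write $p_{e_L}^\psi \equiv p_\varphi^\psi$, we have

$$p_\varphi^\psi(1) = |\langle \psi, \varphi \rangle|^2; \tag{2.13}$$

note the following equality of probability distributions on $\sigma(e_\psi) = \sigma(e_\varphi) = \{0,1\}$:

$$p_\varphi^\psi(1) = p_\psi^\varphi(1). \tag{2.14}$$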

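Expectation values and variances may be defined as in the classical case, viz.

$$E_\rho(a) = \mathrm{Tr}(\rho a); \tag{2.15}$$

$$\Delta_\rho(a) = E_\rho(a^2) - E_\rho(a)^2. \tag{2.16}$$

Similar to (1.11), we may also write the expectation value as

$$E_\rho(a) = \sum_{\lambda \in \sigma(a)} \lambda \cdot p_a(\lambda). \tag{2.17}$$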
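
The Born rule (2.8) and the formulas (2.15) - (2.17) can be illustrated on a single qubit. The sketch below uses a hypothetical density matrix `rho` and the first Pauli matrix, whose spectral projections are written out by hand; `matmul` and `trace` are small helpers defined here, not library calls.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

rho = [[0.5, 0.3], [0.3, 0.5]]        # hypothetical density matrix (eigenvalues 0.8 and 0.2)
sigma_x = [[0.0, 1.0], [1.0, 0.0]]    # first Pauli matrix, spectrum {+1, -1}
projections = {                       # spectral projections e_lambda of sigma_x
    +1: [[0.5, 0.5], [0.5, 0.5]],
    -1: [[0.5, -0.5], [-0.5, 0.5]],
}

born = {lam: trace(matmul(rho, e)) for lam, e in projections.items()}          # (2.8)
expectation = trace(matmul(rho, sigma_x))                                      # (2.15)
variance = trace(matmul(rho, matmul(sigma_x, sigma_x))) - expectation ** 2     # (2.16)
```

The probabilities in `born` are nonnegative and sum to one, and weighting the eigenvalues with them reproduces `expectation`, in line with (2.17).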

The special case $\rho = \rho_\psi$, for which we write $E_{\rho_\psi} \equiv E_\psi$, gives the usual formula

$$E_\psi(a) = \mathrm{Tr}(\rho_\psi a) = \langle \psi, a\psi \rangle. \tag{2.18}$$

As in the classical case one always has $\Delta_\rho(a) \geq 0$, but a major contrast between
classical and quantum mechanics lies in the following result, cf. Proposition 1.3.

Proposition 2.3. For each density operator $\rho$ there exists a self-adjoint operator $b$
such that $\Delta_\rho(b) > 0$. On the other hand, if $a^* = a$, then $\Delta_\rho(a) = 0$ iff the image of $\rho$
lies in some fixed eigenspace of $a$, i.e., in terms of the spectral decomposition (2.6)
we have $a\upsilon_i = \lambda\upsilon_i$ where $\lambda$ is independent of $i$.

Proof. We first prove the first claim for $H = \mathbb{C}^2$. By an appropriate choice of basis,
we may assume that $\rho$ is diagonal, i.e., $\rho = \mathrm{diag}(p_1, p_2)$, with $p_1, p_2 \in [0,1]$ and
$p_1 + p_2 = 1$. Now take $b = \sigma_x$ (i.e., the first Pauli matrix), so that $\mathrm{Tr}(\rho b) = 0$ and
$\mathrm{Tr}(\rho b^2) = 1$. Hence $\Delta_\rho(b) = 1$. Secondly, for general $H = \mathbb{C}^n$, diagonalize $\rho$ and
order the eigenvectors such that the above $2 \times 2$ case forms the upper left block, with
at least one of the eigenvalues $p_1, p_2$ strictly positive. Take $b$ to be $\sigma_x$ in the upper
left corner, and zero elsewhere. This once again yields $\Delta_\rho(b) = p_1 + p_2 > 0$.

For the second claim we use (2.6), and write $\rho_i = \rho_{\upsilon_i}$. We note the inequality

$$\Delta_\rho(a) \geq \sum_i p_i \Delta_{\rho_i}(a), \tag{2.19}$$

with equality iff $\rho_i(a) = \rho_j(a)$ for all $i, j$; this follows from convexity of the function
$x \mapsto x^2$. We now show that for any unit vector $\psi$ we have $\Delta_{\rho_\psi}(a) = 0$ iff $a\psi = \lambda\psi$.
Assuming the latter gives $E_\psi(a) = \langle \psi, a\psi \rangle = \lambda$ and likewise $E_\psi(a^2) = \lambda^2$, hence
$\Delta_{\rho_\psi}(a) = 0$. In the opposite direction, using $a^* = a$, elementary manipulations yield

$$\Delta_{\rho_\psi}(a) = \|(a - \langle \psi, a\psi \rangle)\psi\|^2. \tag{2.20}$$

This clearly vanishes iff $a\psi = \langle \psi, a\psi \rangle\psi$, so $a\psi = \lambda\psi$, with $\lambda = \langle \psi, a\psi \rangle$.

Putting $\psi = \upsilon_i$ gives $\Delta_{\rho_i}(a) = 0$ iff $a\upsilon_i = \lambda_i\upsilon_i$, and then $\Delta_{\sum_i p_i \rho_i}(a) = 0$ iff in addition
$\rho_i(a) = \rho_j(a)$ for all $i, j$. Since $\rho_i(a) = \langle \upsilon_i, a\upsilon_i \rangle = \lambda_i$, we obtain $\lambda_i = \lambda_j$. □
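
The identity (2.20) and the eigenvector criterion for vanishing variance are easy to probe numerically for pure states. The sketch below works with the first Pauli matrix and a few hand-picked unit vectors; the helper names `apply`, `inner`, `variance`, and `residual` are assumptions of this example.

```python
def apply(a, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def inner(v, w):
    """Inner product, conjugate-linear in the first slot."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))

def variance(a, psi):
    """<psi, a^2 psi> - <psi, a psi>^2 for self-adjoint a and a unit vector psi."""
    a_psi = apply(a, psi)
    mean = inner(psi, a_psi).real
    return inner(a_psi, a_psi).real - mean ** 2   # uses a* = a

def residual(a, psi):
    """||(a - <psi, a psi>) psi||^2, the right-hand side of (2.20)."""
    a_psi = apply(a, psi)
    mean = inner(psi, a_psi).real
    r = [x - mean * y for x, y in zip(a_psi, psi)]
    return inner(r, r).real

sigma_x = [[0, 1], [1, 0]]
s = 2 ** -0.5
eigenvector = [s, s]      # eigenvector of sigma_x with eigenvalue +1
other = [1, 0]            # unit vector that is not an eigenvector of sigma_x
generic = [0.6, 0.8j]     # another unit vector
```

The variance vanishes on `eigenvector` and equals one on `other`, matching the first step of the proof, while `variance` and `residual` agree on all three vectors.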

As first recognized by von Neumann, Theorem 2.2 may be generalized to a family
of self-adjoint operators as long as they commute. Thus we obtain the following
counterpart of (1.12) - (1.13): a collection $a_1, \ldots, a_n$ of $n$ commuting self-adjoint
operators and a (single) density operator $\rho$ on $H$ jointly define a probability
distribution $p_{a_1, \ldots, a_n}$ on the product $\sigma(a_1) \times \cdots \times \sigma(a_n)$ of the individual spectra by

$$p_{a_1, \ldots, a_n}(\lambda_1, \ldots, \lambda_n) = \mathrm{Tr}\left(\rho\, e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n}\right). \tag{2.21}$$

The proof of positivity of these numbers requires the spectral projections $e^{(i)}_{\lambda_i}$ to
commute, which they do provided the $a_i$ commute (if the $a_i$ fail to commute, positivity
of (2.21) is not guaranteed, although the numbers do still sum up to unity; the possibility of
defining joint probabilities is strictly limited to commuting random variables).
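
The failure of (2.21) for non-commuting operators can be seen in the smallest possible case. The sketch below evaluates the right-hand side of (2.21) for the spectral projections of the third and first Pauli matrices in a hypothetical pure state, and compares it with a commuting pair; all names and the choice of state are illustrative.

```python
from itertools import product

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

rho = [[0.5, -0.5j], [0.5j, 0.5]]    # projection on the unit vector (1, i)/sqrt(2)
ez = {+1: [[1, 0], [0, 0]], -1: [[0, 0], [0, 1]]}                    # projections of sigma_z
ex = {+1: [[0.5, 0.5], [0.5, 0.5]], -1: [[0.5, -0.5], [-0.5, 0.5]]}  # projections of sigma_x

# Non-commuting pair (sigma_z, sigma_x): candidate "joint probabilities".
joint = {(l, m): trace(matmul(rho, matmul(ez[l], ex[m]))) for l, m in product(ez, ex)}

# Commuting pair (sigma_z, sigma_z) for comparison.
joint_commuting = {(l, m): trace(matmul(rho, matmul(ez[l], ez[m]))) for l, m in product(ez, ez)}
```

In the non-commuting case the four numbers still add up to one, but they are not even real, whereas the commuting pair yields a genuine probability distribution.
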
2.2 Quantum observables and states

Given a finite-dimensional Hilbert space $H$, the set $B(H)$ of all linear operators on $H$
(which for $H = \mathbb{C}^n$ may be identified with the set $M_n(\mathbb{C})$ of complex $n \times n$ matrices)
forms an involutive algebra under the natural (pointwise) operations

$$(\lambda \cdot a)\psi = \lambda(a\psi); \tag{2.22}$$

$$(a + b)\psi = a\psi + b\psi; \tag{2.23}$$

$$(ab)\psi = a(b\psi), \tag{2.24}$$

and finally with $a^*$ given by the usual operator adjoint (A.15). Compare the
corresponding classical expressions (1.18) - (1.20) and (1.22). Analogous to (1.24), we
also have a norm on $B(H)$, defined by (A.18). It follows that like its classical
counterpart $C(X)$, the involutive algebra $B(H)$ (or, in this case, $M_n(\mathbb{C})$) is a C*-algebra,
cf. Definition C.1 in Appendix C. It crucially differs from $C(X)$ in that $B(H)$ is
non-commutative. For this reason, the Gelfand spectrum, which in the classical case
allowed us to reconstruct $X$ from $C(X)$, turns out to be empty, cf. Proposition 2.10
below. Nonetheless, it makes good sense to copy Definition 1.14, mutatis mutandis:

Definition 2.4. A state on $B(H)$ is a complex-linear map $\omega : B(H) \to \mathbb{C}$ satisfying:

1. $\omega(a^*a) \geq 0$ for each $a \in B(H)$ (positivity);

2. $\omega(1_H) = 1$ (normalization).

The state space $S(B(H))$ is the set of all states $\omega : B(H) \to \mathbb{C}$.

Physicists may not like this definition, since it involves non-observable quantities.
As in the classical case, we may introduce the self-adjoint (or 'real') part of $B(H)$:

$$B(H)_{\mathrm{sa}} = \{a \in B(H) \mid a^* = a\}, \tag{2.25}$$

which is a real vector space (though not a real algebra in the usual sense, cf. §C.25).

Definition 2.5. A state on $B(H)_{\mathrm{sa}}$ is a real-linear map $\omega : B(H)_{\mathrm{sa}} \to \mathbb{R}$ satisfying:

1. $\omega(a^2) \geq 0$ for each $a \in B(H)$ with $a^* = a$ (positivity);

2. $\omega(1_H) = 1$ (normalization).

The state space $S(B(H)_{\mathrm{sa}})$ is the set of all states $\omega : B(H)_{\mathrm{sa}} \to \mathbb{R}$.

Fortunately, there is no need for a fight over this point; the discussion is similar to 
the one below Definition 1.14 and is settled as follows. 

Proposition 2.6. The state spaces $S(B(H))$ and $S(B(H)_{\mathrm{sa}})$ may be identified: an
element $\omega$ of the former defines an element $\omega_{\mathbb{R}}$ of the latter by restriction, whilst the
unique decomposition $c = a + ib$ (where $a^* = a$ and $b^* = b$ are given by $a = \tfrac{1}{2}(c + c^*)$
and $b = -\tfrac{1}{2}i(c - c^*)$, respectively) gives $\omega(c) = \omega_{\mathbb{R}}(a) + i\omega_{\mathbb{R}}(b)$. Moreover,

$$\|\omega\| = \|\omega_{\mathbb{R}}\| = 1. \tag{2.26}$$
Here the norm on the dual (Banach) space of $B(H)_{\mathrm{sa}}$ is given by

$$\|\omega\| = \sup\{|\omega(a)|,\ a \in B(H)_{\mathrm{sa}},\ \|a\| = 1\}. \tag{2.27}$$

This proposition holds for any Hilbert space $H$ (cf. Theorem C.52), but it is instructive
to restrict our proof to the finite-dimensional setting in which we currently work.

Proof. The first few claims are immediate from Proposition A.22. To prove (2.26),
it suffices to prove that for any $a \in B(H)$ one has

$$|\omega(a)| \leq \|a\|, \tag{2.28}$$

since by normalization of states the bound is saturated by $a = 1_H$. Furthermore, even
if $\omega$ is seen as an element of $B(H)^*$ rather than $B(H)_{\mathrm{sa}}^*$, eq. (2.28) needs to be shown
only for self-adjoint $a$, for positivity of $\omega$ implies the Cauchy-Schwarz inequality

$$|\omega(a^*b)|^2 \leq \omega(a^*a)\omega(b^*b), \tag{2.29}$$

cf. (A.1), in which we may take $a = 1_H$ to find, assuming (2.28) for self-adjoint $a$,

$$|\omega(b)|^2 \leq \omega(b^*b) \leq \|b^*b\| = \|b\|^2, \tag{2.30}$$

where the last equality holds for any $b \in B(H)$ (turning the latter into a C*-algebra).
Noting that $b^*b$ is self-adjoint, this gives (2.28) for any $a$. To prove (2.28) for $a^* = a$,
then, we firstly use (A.47), and secondly use Theorem 2.7 and eq. (2.6) to obtain

$$|\omega(a)| = |\mathrm{Tr}(\rho a)| = \Big|\sum_i p_i \langle \upsilon_i, a\upsilon_i \rangle\Big| \leq \sum_i p_i |\langle \upsilon_i, a\upsilon_i \rangle|. \tag{2.31}$$

Now let $(u_j)$ be a basis of $H$ consisting of eigenvectors of $a$, with $au_j = \lambda_j u_j$, so that

$$\langle \upsilon_i, a\upsilon_i \rangle = \sum_j \lambda_j |\langle u_j, \upsilon_i \rangle|^2, \qquad \sum_j |\langle u_j, \upsilon_i \rangle|^2 = 1.$$

Since $|\lambda_j| \leq \|a\|$ and $\sum_i p_i = 1$, the bound (2.28) follows from the estimate

$$\sum_i p_i |\langle \upsilon_i, a\upsilon_i \rangle| \leq \sum_i p_i \sum_j |\lambda_j| |\langle u_j, \upsilon_i \rangle|^2 \leq \|a\| \sum_i p_i \sum_j |\langle u_j, \upsilon_i \rangle|^2 = \|a\|. \tag{2.32}$$

Finally, combining (2.31) and (2.32) gives (2.28) for self-adjoint $a$. □

In view of this, we may work with either $S(B(H)_{\mathrm{sa}})$ or $S(B(H))$; denoting states
simply by $\omega$, the context will usually show if it is defined on $B(H)_{\mathrm{sa}}$ or on $B(H)$.
Despite its easy proof, the following result is of fundamental importance.

Theorem 2.7. If $H$ is finite-dimensional, there is a bijective correspondence
between states $\omega$ on $B(H)$ or $B(H)_{\mathrm{sa}}$ and density operators $\rho$ on $H$, given by

$$\omega(a) = \mathrm{Tr}(\rho a). \tag{2.33}$$

Proof. First note that linear algebra already yields (2.33) as a bijective
correspondence between complex-linear maps $\omega$ and operators $\rho$, for example, because

$$\langle a, b \rangle = \mathrm{Tr}(a^*b) \tag{2.34}$$

defines an inner product on $B(H)$. Positivity and normalization of $\omega$ then translate
to the corresponding properties of $\rho$. □
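
The bijection (2.33) can be made explicit with matrix units: since $\mathrm{Tr}(\rho E_{kl}) = \rho_{lk}$, the matrix entries of $\rho$ are recovered as $\rho_{ij} = \omega(E_{ji})$. The sketch below checks this for a hypothetical $2 \times 2$ density matrix; `matrix_unit`, `gram`, and the other helpers are names introduced only for this illustration.

```python
def trace_product(rho, a):
    """Tr(rho a) for square matrices given as nested lists."""
    n = len(rho)
    return sum(rho[i][j] * a[j][i] for i in range(n) for j in range(n))

def matrix_unit(k, l, n):
    """E_kl: the n x n matrix with a single 1 in row k, column l."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(n)] for i in range(n)]

def gram(a):
    """a* a, used to test positivity of a state."""
    n = len(a)
    return [[sum(complex(a[k][i]).conjugate() * a[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def state_from_density(rho):
    """The functional omega(a) = Tr(rho a) of (2.33)."""
    return lambda a: trace_product(rho, a)

def density_from_state(omega, n):
    """rho_ij = omega(E_ji), because Tr(rho E_kl) = rho_lk."""
    return [[omega(matrix_unit(j, i, n)) for j in range(n)] for i in range(n)]

rho = [[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]]   # hypothetical density matrix
omega = state_from_density(rho)
identity = [[1, 0], [0, 1]]
recovered = density_from_state(omega, 2)
```

Normalization of `omega` corresponds to unit trace of `rho`, and evaluating `omega` on matrices of the form `gram(a)` tests positivity.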

The quantum analogue of Theorem 1.15, then, is as follows. 

Theorem 2.8. The state space $S(B(H)_{\mathrm{sa}}) = S(B(H))$ forms a compact convex set in
the (real) vector space $B(H)_{\mathrm{sa}}^*$ (in its w*-topology) and, putting the corresponding
topology on $\mathcal{D}(H)$, eq. (2.33) defines an affine homeomorphism

$$S(B(H)) \cong \mathcal{D}(H). \tag{2.35}$$

Proof. Convexity of $S(B(H))$ holds by Definition 2.4. For compactness, by
Proposition 2.6 the state space $S(B(H))$ is contained in the closed unit ball of $B(H)_{\mathrm{sa}}^*$,
which is compact in the w*-topology (in the case at hand this is simply because
$B(H)_{\mathrm{sa}}^*$ is finite-dimensional). It is easy to see that a convergent sequence of states
actually converges to a state, since both conditions in Definition 2.4 are clearly
preserved by w* limits (in which $\omega_n \to \omega$ iff $\omega_n(a) \to \omega(a)$ for each $a \in B(H)$). □

For infinite-dimensional Hilbert spaces eq. (2.35) is false; see §4.2. At the opposite
end, the case $H = \mathbb{C}^2$ provides a beautiful illustration of this theorem (and more).

Proposition 2.9. The state space $S(M_2(\mathbb{C}))$ of the $2 \times 2$ matrices is isomorphic (as a compact convex set) to the closed unit ball $B^3 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \leq 1\}$.

On this isomorphism, the extreme boundary (cf. Definition 1.10)

    $\partial_e B^3 = S^2 = \{(x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$    (2.36)

corresponds to the set of all density matrices $\rho = e_\psi$, where $\psi \in \mathbb{C}^2$ with $\|\psi\| = 1$.

Proof. Any self-adjoint $2 \times 2$ matrix may be parametrized by $(t,x,y,z) \in \mathbb{R}^4$ as

    $\rho(t,x,y,z) = \frac{1}{2}\begin{pmatrix} t+z & x-iy \\ x+iy & t-z \end{pmatrix}.$    (2.37)

The eigenvalues $\lambda_\pm$ of $\rho(t,x,y,z)$, computed from its characteristic polynomial, are

    $\lambda_\pm = \frac{1}{2}\left(t \pm \sqrt{x^2 + y^2 + z^2}\right).$    (2.38)

Condition (2.5) yields $t = 1$. Positivity of $\rho(1,x,y,z)$ is equivalent to positivity of its eigenvalues $\lambda_\pm$, which gives $x^2 + y^2 + z^2 \leq 1$. For the second claim, note that the $e_\psi$ are just the one-dimensional projections, which in turn are the density matrices satisfying $\rho^2 = \rho$ (or require $\lambda_+ = 1$, $\lambda_- = 0$), so $x^2 + y^2 + z^2 = 1$. Finally, since convex sums $tv + (1-t)w$ in $\mathbb{R}^3$ ($0 < t < 1$) are given by straight line segments connecting $w$ and $v$, it immediately follows geometrically that $\partial_e B^3 = S^2$. □
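The parametrization (2.37) - (2.38) is easily explored numerically. The sketch below assumes $t = 1$ and uses function names of our own choosing; it tests membership of the ball and of its boundary sphere through the eigenvalues $\lambda_\pm$.

```python
import math

def density_matrix(x, y, z):
    """rho(1, x, y, z) as in (2.37)."""
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

def eigenvalues(x, y, z):
    """lambda_plus and lambda_minus from (2.38) with t = 1."""
    r = math.sqrt(x * x + y * y + z * z)
    return 0.5 * (1 + r), 0.5 * (1 - r)

def is_state(x, y, z, tol=1e-12):
    """Positivity: both eigenvalues non-negative, i.e. (x, y, z) lies in the unit ball."""
    return min(eigenvalues(x, y, z)) >= -tol

def is_pure(x, y, z, tol=1e-12):
    """Purity: eigenvalues (1, 0), i.e. (x, y, z) lies on the unit sphere."""
    lam_plus, lam_minus = eigenvalues(x, y, z)
    return abs(lam_plus - 1) <= tol and abs(lam_minus) <= tol
```

For instance, the origin gives the maximally mixed state $\frac{1}{2} \cdot 1_2$, whereas the north pole $(0,0,1)$ gives the projection onto the first basis vector.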
2.3 Pure states in quantum mechanics

In classical physics, the phase space $X$ arose both as the Gelfand spectrum $\Sigma(C(X))$ of the C*-algebra of observables $C(X)$, cf. Definition 1.4 and Proposition 1.5, and as the pure state space $P(C(X))$ of $C(X)$, see Definition 1.10 and Theorem 1.16. In particular, $\Sigma(C(X)) = P(C(X))$ at least as sets. Because of this, any pure state $\omega \in P(C(X))$ is dispersion-free, since as an element of $\Sigma(C(X))$ it satisfies $\omega(f^2) = \omega(f)^2$ for any $f \in C(X)$. These two definitionally different (but classically coinciding) guises of $X$ will fall apart in quantum mechanics; cf. Proposition 2.3.

Proposition 2.10. If $\dim(H) > 1$, the Gelfand spectrum $\Sigma(B(H))$ of $B(H)$ is empty, i.e., there are no nonzero linear maps $\omega : B(H) \to \mathbb{C}$ that satisfy $\omega(ab) = \omega(a)\omega(b)$.

In particular, there are no nonzero linear maps $\omega : B(H) \to \mathbb{C}$ that are dispersion-free, i.e., satisfy $\Delta_\omega(a) = 0$, with $\Delta_\omega(a) = \omega(a^2) - \omega(a)^2$.

Proof. Suppose $\omega \in \Sigma(B(H))$. Multiplicativity for $b = a = a^*$ implies that $\omega$ is positive, whereas for $b = 1_H$ it implies that $\omega$ is normalized. Hence $\omega$ must be a state. Now use Theorem 2.7 and use multiplicativity for $b = a = a^*$, implying that $\Delta_\rho(a) = 0$. This contradicts Proposition 2.3. □
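To see the failure of multiplicativity in the smallest case, one may compute the dispersion $\Delta_\omega(a) = \omega(a^2) - \omega(a)^2$ for a vector state on $M_2(\mathbb{C})$. This is only an illustration under our own choice of state and observables (the Pauli matrices `sigma_z` and `sigma_x`), not part of the proof.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(psi, a):
    """<psi, a psi> for a unit vector psi."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * a[i][j] * psi[j]
               for i in range(n) for j in range(n))

def dispersion(psi, a):
    """omega(a^2) - omega(a)^2 for the vector state omega = <psi, . psi>."""
    return (expectation(psi, matmul(a, a)) - expectation(psi, a) ** 2).real

psi = (1.0, 0.0)              # first standard basis vector
sigma_z = [[1, 0], [0, -1]]   # psi is an eigenvector: dispersion 0
sigma_x = [[0, 1], [1, 0]]    # psi is not an eigenvector: dispersion 1
```

The state is dispersion-free on `sigma_z` but not on `sigma_x`, so it cannot be multiplicative on all of $M_2(\mathbb{C})$.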

On the other hand, the pure state space of $B(H)$ is by no means empty, and despite Proposition 2.10, we will see that the special density operators $\rho = e_\psi$ in (2.7) to some extent do play the role of the points $x \in X$. Let us write

    $\mathcal{P}_1(H) = \{e \in B(H) \mid e^2 = e^* = e, \mathrm{Tr}(e) = 1\}$    (2.39)

for the set of all one-dimensional projections on $H$; note that $\mathrm{Tr}(e) = \dim(eH)$ for $e \in \mathcal{P}(H)$. Each $e \in \mathcal{P}_1(H)$ takes the form $e = e_\psi$ for some unit vector $\psi$, see (2.7).

Lemma 2.11. A density operator $\rho$ is an extreme point of the convex set $\mathcal{D}(H)$ of all density operators on $H$ iff $\rho = e_\psi$ for some unit vector $\psi \in H$.

Proof. The argument is similar to the proof of Proposition 1.11. To show that $e_\psi \in \partial_e S(B(H))$, assume $e_\psi = t\rho_1 + (1-t)\rho_2$ for some $t \in (0,1)$ and $\rho_1, \rho_2 \in S(B(H))$. Evaluating this equality at $a = |\varphi\rangle\langle\varphi|$, where $\varphi \perp \psi$, yields $\langle\varphi, \rho_i\varphi\rangle = 0$ for $i = 1,2$, so that $\rho_1 = \rho_2 = e_\psi$. Conversely, the spectral decomposition (2.6) shows that $\rho \notin \partial_e S(B(H))$ whenever $\rho$ is not of the form $e_\psi$ for some unit vector $\psi \in H$. □

Consequently, for the moment just as sets (and even as topological spaces), one has 

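    $P(\mathcal{D}(H)) = \mathcal{P}_1(H);$    (2.40)

    $P(B(H)) \cong \mathcal{P}_1(H),$    (2.41)

where the second isomorphism is given by (2.33). Defining a state $\omega_\psi$ by

    $\omega_\psi(a) = \langle\psi, a\psi\rangle,$    (2.42)

cf. (2.18), the isomorphism (2.41) is the correspondence $\omega_\psi \leftrightarrow e_\psi$, cf. (2.7).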
This isomorphism becomes more interesting if we note that both spaces are naturally equipped with transition probabilities. For $P(B(H))$ we canonically have

    $\tau_{B(H)}(\omega_\psi, \omega_\varphi) = \inf\{\omega_\psi(a) \mid a \in B(H), 0 \leq a \leq 1_H, \omega_\varphi(a) = 1\},$    (2.43)

as in (1.38) for $A = B(H)$. Furthermore, on $\mathcal{P}_1(H)$ we define (with some foresight)

    $\tau_{\mathcal{P}_1(H)}(e, f) = \mathrm{Tr}(ef).$    (2.44)

Theorem 2.12. The pairs $(P(B(H)), \tau_{B(H)})$ and $(\mathcal{P}_1(H), \tau_{\mathcal{P}_1(H)})$ are isomorphic as sets with a transition probability. In particular, we have, cf. (2.13),

    $\tau_{B(H)}(\omega_\psi, \omega_\varphi) = |\langle\psi, \varphi\rangle|^2 = \mathrm{Tr}(e_\psi e_\varphi) = \tau_{\mathcal{P}_1(H)}(e_\psi, e_\varphi).$    (2.45)

Proof. The last equality is a simple computation. The first follows if we can show that the infimum in (2.43) is reached at $a = e_\varphi$. To this end, we prove that for any $0 \leq a \leq 1_H$ with $\omega_\varphi(a) = 1$ we must have $\langle\psi, a\psi\rangle \geq |\langle\varphi, \psi\rangle|^2$. Indeed, the condition $\omega_\varphi(a) = \langle\varphi, a\varphi\rangle = 1$ with $\|a\| \leq 1$ (which follows from $0 \leq a \leq 1_H$) and $\|\varphi\| = 1$ imply, by Cauchy-Schwarz, that $a\varphi = \varphi$. Since $a^* = a$ (by positivity of $a$), we also have $a : (\mathbb{C} \cdot \varphi)^\perp \to (\mathbb{C} \cdot \varphi)^\perp$, so we may write $a = e_\varphi + a'$, with $a'\varphi = 0$ and $a'$ mapping $(\mathbb{C} \cdot \varphi)^\perp$ to itself. Then $a \geq 0$ implies $a' \geq 0$. If $\langle\psi, a\psi\rangle < |\langle\varphi, \psi\rangle|^2$, then $\langle\psi, a'\psi\rangle < 0$, which contradicts positivity of $a'$ (and hence of $a$). □
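The equalities between $|\langle\psi,\varphi\rangle|^2$ and $\mathrm{Tr}(e_\psi e_\varphi)$ in Theorem 2.12 may be checked numerically. The following sketch assumes two hypothetical unit vectors in $\mathbb{C}^2$; it also confirms that $a = e_\varphi$ is admissible in the infimum (2.43), in that $\omega_\varphi(e_\varphi) = 1$.

```python
import math

def inner(psi, phi):
    """<psi, phi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(psi, phi))

def projection(psi):
    """e_psi = |psi><psi| for a unit vector psi."""
    return [[complex(x) * complex(y).conjugate() for y in psi] for x in psi]

def trace_of_product(a, b):
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def expectation(psi, a):
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * a[i][j] * psi[j]
               for i in range(n) for j in range(n))

psi = (1.0, 0.0)
phi = (1 / math.sqrt(2), 1j / math.sqrt(2))

overlap = abs(inner(psi, phi)) ** 2
trace_form = trace_of_product(projection(psi), projection(phi)).real
value_at_e_phi = expectation(psi, projection(phi)).real   # omega_psi(e_phi)
constraint = expectation(phi, projection(phi)).real       # omega_phi(e_phi) = 1
```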

The theory of observables and spectral resolutions of the kind (1.45) may be 
worked out completely for the “quantum” transition probabilities in this theorem: 

Proposition 2.13. 1. There is a bijective correspondence between self-adjoint operators $a \in B(H)$ and observables $f$ on $\mathcal{P}_1(H)$, à la Definition 1.18.6:

• Given a self-adjoint operator $a$, define an observable $f_a$ at $e_\psi \in \mathcal{P}_1(H)$ by

    $f_a(e_\psi) = \mathrm{Tr}(e_\psi a) = \langle\psi, a\psi\rangle;$    (2.46)

• Given an observable $f$ of the general form (1.44), with coefficients $c_i$ and points $y_i = e_i$, define an operator $a_f$ by

    $a_f = \sum_i c_i e_i.$    (2.47)

2. Each such observable f = fa has a unique spectral resolution as in (1.45), i.e., 

fa= E (2.48) 

A,GCT(a) 

where $S_\lambda$ is the (automatically orthoclosed) subset of $\mathcal{P}_1(H)$ whose elements $e$ satisfy $eH \subseteq H_\lambda$, where $H_\lambda \subset H$ is the eigenspace for the eigenvalue $\lambda \in \sigma(a)$.

3. The product defined by (1.46) - (1.47) is equal to

    $f_a^2 = f_{a^2};$    (2.49)

    $f_a \circ f_b = f_{(ab+ba)/2}.$    (2.50)

Proof. Any spectral decomposition $a = \sum_i \lambda_i |\upsilon_i\rangle\langle\upsilon_i|$ puts $f_a$ as defined in (2.46) in the general form (1.44), with $c_i = \lambda_i$ and $y_i = e_{\upsilon_i}$. The rest should be clear. □

We now turn to the quantum counterpart of Proposition 1.13. The main difference is that although extremal decompositions of mixed states into pure ones always exist, they are no longer unique. For example, for $H = \mathbb{C}^2$, we have

    $\rho = \mathrm{diag}(2/3, 1/3) = \frac{2}{3} e_{u_1} + \frac{1}{3} e_{u_2} = \frac{1}{2}(e_{\psi_1} + e_{\psi_2}),$

where $(u_1, u_2)$ is the standard basis of $\mathbb{C}^2$, and

    $\psi_1 = (\sqrt{2/3}, \sqrt{1/3}), \quad \psi_2 = (\sqrt{2/3}, -\sqrt{1/3}).$

More generally, take any basis $(w_i)$ of $H = \mathbb{C}^n$, assume (2.6), and for each $i$ for which $\sqrt{\rho}\,w_i \neq 0$ (where $\sqrt{\rho} = \sum_i \sqrt{p_i}\,|\upsilon_i\rangle\langle\upsilon_i|$), define $t_i = \|\sqrt{\rho}\,w_i\|^2$ as well as the unit vector $\psi_i = \sqrt{\rho}\,w_i / \|\sqrt{\rho}\,w_i\|$. Then $\rho = \sum_i t_i e_{\psi_i}$ is an extremal decomposition of $\rho$. The above example corresponds to the special case $t_1 = t_2 = 1/2$, with

    $n = 2, \quad p_1 = 2/3, \quad p_2 = 1/3, \quad w_1 = (1/\sqrt{2}, 1/\sqrt{2}), \quad w_2 = (1/\sqrt{2}, -1/\sqrt{2}).$

One might require the $\psi_i$ to be mutually orthogonal, but even that does not imply uniqueness of the extremal decomposition; take, for example, $\rho = (1/n) \cdot 1_n$, where $1_n$ is the $n \times n$ unit matrix on $H = \mathbb{C}^n$. Then any basis induces (2.6).
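The two decompositions of $\rho = \mathrm{diag}(2/3, 1/3)$, as well as the general recipe $t_i = \|\sqrt{\rho}\,w_i\|^2$, $\psi_i = \sqrt{\rho}\,w_i / \|\sqrt{\rho}\,w_i\|$, can be verified by direct computation. The sketch below is restricted to real vectors for brevity; all function and variable names are ours.

```python
import math

def projection(psi):
    """e_psi for a real unit vector psi."""
    return [[x * y for y in psi] for x in psi]

def mixture(weights, vectors):
    """sum_i t_i e_{psi_i}."""
    n = len(vectors[0])
    total = [[0.0] * n for _ in range(n)]
    for t, psi in zip(weights, vectors):
        e = projection(psi)
        for i in range(n):
            for j in range(n):
                total[i][j] += t * e[i][j]
    return total

s, c = math.sqrt(2 / 3), math.sqrt(1 / 3)
rho_spectral = mixture([2 / 3, 1 / 3], [(1.0, 0.0), (0.0, 1.0)])
rho_other = mixture([0.5, 0.5], [(s, c), (s, -c)])

# General recipe, with sqrt(rho) = diag(sqrt(2/3), sqrt(1/3)) acting on the basis (w_i).
sqrt_p = (s, c)
r = 1 / math.sqrt(2)
w = [(r, r), (r, -r)]
images = [tuple(sp * wi for sp, wi in zip(sqrt_p, vec)) for vec in w]
t = [sum(x * x for x in v) for v in images]
psis = [tuple(x / math.sqrt(ti) for x in v) for v, ti in zip(images, t)]
rho_recipe = mixture(t, psis)
```

All three matrices agree with $\mathrm{diag}(2/3, 1/3)$ up to rounding, and the recipe reproduces $t_1 = t_2 = 1/2$.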

Nonetheless, under appropriate assumptions uniqueness does follow. 

Proposition 2.14. 1. Any density operator $\rho$ on $H$ has an extremal decomposition

    $\rho = \sum_{i=1}^m p_i e_{\psi_i},$    (2.51)

where $m \leq \dim(H)$, the $p_i$ are probabilities, and the $\psi_i$ are distinct unit vectors.
2. This decomposition can be chosen such that the $\psi_i$ are mutually orthogonal, in which case it is unique iff each of the non-zero eigenvalues of $\rho$ is simple.

Proof. The existence of the extremal decomposition (2.51) of $\rho$ follows from its spectral decomposition (2.6), which also proves claim 2. If $\rho$ has some degenerate non-zero eigenvalue, the example just given yields non-uniqueness of (2.51). For the converse direction, use uniqueness of the decomposition (2.6) under the condition that each of the non-zero eigenvalues of $\rho$ is simple. □

In the light of Theorem 2.7, it would be interesting to reformulate Proposition 2.14 directly in terms of the states on $B(H)$; note our standing assumption $\dim(H) < \infty$!

Proposition 2.15. 1. Any state $\omega$ on $B(H)$ has an extremal decomposition

    $\omega = \sum_{i=1}^m p_i \omega_i$    (2.52)

into distinct pure states $\omega_i \in P(B(H))$, where $m \leq \dim(H)$, $p_i > 0$, and $\sum_i p_i = 1$.
2. The unit vectors $\psi_i$ that correspond to the pure states $\omega_i$ in (2.52) via (2.42) are mutually orthogonal (and hence are part or all of a basis of $H$) iff

    $\|\omega_i - \omega_j\| = 2 \quad (i \neq j).$    (2.53)

3. Extremal decompositions (2.52) satisfying (2.53) exist and correspond bijectively to orthogonal families $(e_i)$ of one-dimensional projections on $H$ (i.e., $e_i e_j = \delta_{ij} e_i$ and $\mathrm{Tr}(e_i) = 1$, respectively) for which $\omega(e_i) > 0$, $\sum_i \omega(e_i) = 1$, and

    $\omega(ae_i) = \omega(e_i a), \quad a \in B(H).$    (2.54)

In terms of such a family, the decomposition (2.52) is given by

    $p_i = \omega(e_i);$    (2.55)

    $\omega_i(a) = \mathrm{Tr}(e_i a).$    (2.56)

Hence an extremal decomposition (2.52) with all $\omega_i$ mutually orthogonal in the sense of (2.53) is unique iff the family $(e_i)$ with the above properties is.

Proof. Claim 1 clearly follows from no. 3. To prove (2.53), assume (2.42), so that

    $\|\omega_i - \omega_j\| = \sup\{|\langle\psi_i, a\psi_i\rangle - \langle\psi_j, a\psi_j\rangle|, \; a \in B(H), \|a\| = 1\}.$    (2.57)

Clearly, $|\langle\psi, a\psi\rangle| \leq 1$ when $\|a\| = \|\psi\| = 1$, hence $|\langle\psi_i, a\psi_i\rangle - \langle\psi_j, a\psi_j\rangle| \leq 2$, and the upper bound $\|\omega_i - \omega_j\| = 2$ in (2.57) is reached iff $|\langle\psi_i, a\psi_i\rangle| = 1$ and $\langle\psi_j, a\psi_j\rangle = -\langle\psi_i, a\psi_i\rangle$. By Cauchy-Schwarz, this holds iff $a\psi_i = \lambda\psi_i$ as well as $a\psi_j = -\lambda\psi_j$ for some $\lambda \in \mathbb{T}$. If $\psi_i \perp \psi_j$, then this is accomplished by the operator $a = |\psi_i\rangle\langle\psi_i| - |\psi_j\rangle\langle\psi_j|$; note that $\sigma(a) = \{-1, 1\}$ for $\dim(H) = 2$ and $\sigma(a) = \{-1, 0, 1\}$ for $\dim(H) > 2$, so indeed $\|a\| = 1$ by (A.47). If, on the other hand, $\langle\psi_i, \psi_j\rangle \neq 0$, then no $a$ with $\|a\| = 1$ can meet these eigenvalue equations. One way to see this is to reduce to $H = \mathbb{C}^2$, since $a$ in (2.57) can be replaced by $eae$, where $e$ is the projection onto the linear span of $\psi_i$ and $\psi_j$. Picking a basis of $\mathbb{C}^2$ (with say $u_1 = \psi_i$), the two eigenvalue equations for $a$ yield a matrix representation of $a$, from which $\|a\|^2 = \|a^*a\|$ may be computed by calculating the eigenvalues of $a^*a$ and using (A.47). This gives $\|a\| > 1$ unless $\langle\psi_i, \psi_j\rangle = 0$.
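The criterion (2.53) can be illustrated numerically for $H = \mathbb{C}^2$. The sketch below relies on an assumption that is not derived in the text, namely the standard duality fact that the norm of the functional $a \mapsto \mathrm{Tr}(da)$ on $B(H)$ equals the trace norm of $d$; for $d = e_\psi - e_\varphi$, which is Hermitian and traceless, this is twice the positive eigenvalue of $d$.

```python
import math

def projection(psi):
    """e_psi = |psi><psi| for a unit vector psi in C^2."""
    return [[complex(x) * complex(y).conjugate() for y in psi] for x in psi]

def norm_distance(psi, phi):
    """||omega_psi - omega_phi||, computed as the trace norm of d = e_psi - e_phi."""
    e, f = projection(psi), projection(phi)
    d00 = (e[0][0] - f[0][0]).real
    d01 = e[0][1] - f[0][1]
    # d is Hermitian and traceless, so its eigenvalues are +-sqrt(d00^2 + |d01|^2).
    return 2 * math.sqrt(d00 ** 2 + abs(d01) ** 2)

orthogonal = norm_distance((1.0, 0.0), (0.0, 1.0))
r = 1 / math.sqrt(2)
oblique = norm_distance((1.0, 0.0), (r, r))
```

The distance equals 2 for the orthogonal pair and is strictly smaller for the non-orthogonal pair, in agreement with claim 2.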

One direction of the proof of the third claim easily follows from Theorem 2.7: any spectral decomposition (2.6) of $\rho$ provides the projections

    $e_i = |\upsilon_i\rangle\langle\upsilon_i|$    (2.58)

of the proposition. For example, eq. (2.54) comes down to $[\rho, e_i] = 0$, which is the case iff $e_i$ commutes with all spectral projections of $\rho$, which clearly holds for (2.58). Uniqueness of the $e_i$ then corresponds to uniqueness of (2.6) and hence to non-degeneracy of the non-zero eigenvalues $p_i$ of $\rho$, as in Proposition 2.14.

The opposite direction, i.e., proving that (2.58) exhausts all possibilities for (2.53) - (2.54), is based on the GNS-construction and requires an entire subsection.


2.4 The GNS-construction for matrices

The proof of Proposition 2.15 may be completed on the basis of the GNS-construction begun in §1.5, which in this subsection we develop for $A = B(H)$, where, as usual, $\dim(H) < \infty$. In that case, we may use Theorem 2.7 to simplify matters.

First, to prove (1.76) we use (2.33) and cyclicity of the trace, compute the trace by summing over a basis $(\upsilon_i)$ of eigenvectors of $a^*a$, say $a^*a\upsilon_i = \mu_i\upsilon_i$, where $\mu_i \geq 0$ by positivity of $a^*a$, and use (A.47) (for $a^*a$ rather than $a$) to obtain:

    $\omega(b^*a^*ab) = \mathrm{Tr}(\rho b^*a^*ab) = \sum_i \langle\upsilon_i, b\rho b^*a^*a\upsilon_i\rangle = \sum_i \mu_i \langle\upsilon_i, b\rho b^*\upsilon_i\rangle$
    $\leq \|a^*a\| \sum_i \langle\upsilon_i, b\rho b^*\upsilon_i\rangle = \|a\|^2 \mathrm{Tr}(\rho b^*b) = \|a\|^2 \omega(b^*b),$

where we used $\langle\upsilon_i, b\rho b^*\upsilon_i\rangle = \langle b^*\upsilon_i, \rho b^*\upsilon_i\rangle \geq 0$ to justify the inequality.

We now explain all cases of interest, paying special attention to the commutant

    $\pi_\omega(A)' = \{B \in B(H_\omega) \mid \pi_\omega(a)B = B\pi_\omega(a) \; \forall a \in A\};$    (2.59)

to distinguish operators on $H$ from operators on $H_\omega$, we write the latter in capitals. For simplicity we also put $H = \mathbb{C}^n$ (with the standard inner product), so that

    $B(H) = M_n(\mathbb{C}),$    (2.60)

and all operators are matrices. Performing a suitable unitary transformation or change of basis if necessary, we also assume that the unit vectors $\upsilon_i$ in the spectral decomposition (2.6) of $\rho$ form (all or part of) the standard basis $(\upsilon_1, \ldots, \upsilon_n)$ of $\mathbb{C}^n$. As in (1.74), we denote the null space by

    $N_\rho = \{a \in B(H) \mid \mathrm{Tr}(\rho a^*a) = 0\}.$    (2.61)

• If $\rho = |\upsilon_j\rangle\langle\upsilon_j|$, the corresponding pure state (2.42) is $\omega(a) = \langle\upsilon_j, a\upsilon_j\rangle$, with

    $N_\rho = \{a \in A \mid a\upsilon_j = 0\}.$    (2.62)

Hence $a \in N_\rho$ iff the $j$'th column $C_j(a)$ of $a$ vanishes, so we have $a - b \in N_\rho$ iff $C_j(a) = C_j(b)$. Thus the equivalence class $a_\rho \in M_n(\mathbb{C})/N_\rho$ may be identified with $C_j(a)$. Consequently, we obtain

    $H_\rho = M_n(\mathbb{C})/N_\rho \cong \mathbb{C}^n$    (2.63)

under the unitary isomorphism $u : H_\rho \to \mathbb{C}^n$, $a_\rho \mapsto C_j(a)$, with inverse $u^{-1} : z \mapsto a_\rho$, $z \in \mathbb{C}^n$, where $a$ is the matrix with $C_j(a) = z$ and zeros elsewhere (i.e., $a_{ij} = z_i$ and $a_{ik} = 0$ for all $i$ and $k \neq j$). We likewise write $u^{-1}w = b_\rho$, with $b_{ij} = w_i$ and $b_{ik} = 0$ for all $i$ and $k \neq j$. With $ua_\rho = z$ and $ub_\rho = w$, we obtain (beware: no sum over $j$!):

    $\langle a_\rho, b_\rho\rangle = \mathrm{Tr}(\rho a^*b) = \sum_i \overline{a_{ij}}\, b_{ij} = \sum_i \overline{z_i}\, w_i = \langle z, w\rangle_{\mathbb{C}^n} = \langle ua_\rho, ub_\rho\rangle_{\mathbb{C}^n}.$

The GNS-representation $\pi_\rho$, originally given on $H_\rho$ by (1.77), is accordingly transformed to $u\pi_\rho(a)u^{-1} = \hat{\pi}_\rho(a)$ on $\mathbb{C}^n$, which is given by

    $\hat{\pi}_\rho(a)w = u\pi_\rho(a)b_\rho = u(ab)_\rho = C_j(ab) = aw,$

and the cyclic vector $u\Omega_\rho \in \mathbb{C}^n$ is just the basis vector $\upsilon_j$ from which we started. More generally, for a pure state (2.42) the GNS-representation $\pi_{\omega_\psi}(M_n(\mathbb{C}))$ is equivalent to the defining representation on $\mathbb{C}^n$, with canonical cyclic vector $\psi$. Finally, since only multiples of the unit matrix commute with all matrices, it follows that

    $\pi_{\omega_\psi}(M_n(\mathbb{C}))' \cong \mathbb{C}.$    (2.64)
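The identification of the GNS Hilbert space of a pure state with $\mathbb{C}^n$ through the $j$'th column may be checked on examples. The sketch below assumes $n = 2$ and $j$ equal to the second basis vector (index 1 in zero-based counting); the matrices `a` and `b` are arbitrary choices of ours.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def column(a, j):
    """C_j(a): the j-th column of a (zero-based)."""
    return [row[j] for row in a]

def inner(z, w):
    return sum(complex(x).conjugate() * y for x, y in zip(z, w))

def matvec(a, w):
    return [sum(a[i][k] * w[k] for k in range(len(w))) for i in range(len(a))]

j = 1
rho = [[0, 0], [0, 1]]        # |v_j><v_j| for the standard basis vector v_j
a = [[1, 2j], [3, 4]]
b = [[0, 1], [1j, 2]]

gns_inner = trace(matmul(rho, matmul(dagger(a), b)))     # Tr(rho a* b)
column_inner = inner(column(a, j), column(b, j))         # <C_j(a), C_j(b)>
represented = column(matmul(a, b), j)                    # C_j(ab)
defining = matvec(a, column(b, j))                       # a C_j(b)
```

Both inner products agree, and $C_j(ab) = a\,C_j(b)$ confirms that the transformed representation is the defining one.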

• The 'opposite' case occurs when $\rho$ is invertible, in other words, when the sum over $i$ in (2.6) has $n$ nonzero terms. Hence

    $\mathrm{Tr}(\rho a^*a) = \sum_{i=1}^n p_i \|a\upsilon_i\|^2$    (2.65)

vanishes iff $a\upsilon_i = 0$ for each $i$, i.e., $a = 0$, so that $N_\rho = \{0\}$ and hence

    $H_\rho = M_n(\mathbb{C}).$    (2.66)

The GNS-constructed inner product on $M_n(\mathbb{C})$, cf. (1.78), given by

    $\langle a_\rho, b_\rho\rangle = \mathrm{Tr}(\rho a^*b),$    (2.67)

may be transformed into the usual one (2.34) by the following linear map:

    $u : M_n(\mathbb{C}) \to M_n(\mathbb{C});$    (2.68)

    $ua_\rho = a\rho^{1/2}.$    (2.69)

This map is unitary from the Hilbert space $(M_n(\mathbb{C}), \langle\cdot,\cdot\rangle_\rho)$ to the Hilbert space $(M_n(\mathbb{C}), \langle\cdot,\cdot\rangle)$, for it is invertible, with inverse $u^{-1}a = (a\rho^{-1/2})_\rho$, as well as isometric:

    $\langle u(a), u(b)\rangle = \mathrm{Tr}(\rho^{1/2}a^*b\rho^{1/2}) = \mathrm{Tr}(\rho a^*b) = \langle a_\rho, b_\rho\rangle.$

The transformed representation $\hat{\pi}_\rho(a) = u\pi_\rho(a)u^{-1}$ on $M_n(\mathbb{C})$ is simply given by

    $\hat{\pi}_\rho(a)b = ab,$    (2.70)

and the cyclic vector $u\Omega_\rho$ in $M_n(\mathbb{C})$ becomes $\rho^{1/2}$, so that, as in (1.73),

    $\langle\rho^{1/2}, \hat{\pi}_\rho(a)\rho^{1/2}\rangle = \mathrm{Tr}(\rho a).$    (2.71)

In this case, the commutant is easily computed to be

    $\hat{\pi}_\rho(M_n(\mathbb{C}))' \cong M_n(\mathbb{C}),$    (2.72)

since any linear map $C : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ that satisfies $C(ab) = aC(b)$ for each $a, b \in M_n(\mathbb{C})$ is of the form $C(a) = ac = R_c(a)$ for some $c \in M_n(\mathbb{C})$, namely $c = C(1)$; to see this, just take $b = 1$. Since this involves right multiplication by $c$, which messes up the order in that $R_cR_d = R_{dc}$, one has a choice in implementing the isomorphism (2.72) either as a linear anti-homomorphism (of algebras) $c \mapsto R_c$, or as an anti-linear homomorphism $c \mapsto R_{c^*}$ (see also Theorem C.159).

Further insight into the structure of this representation comes from the realization

    $M_n(\mathbb{C}) \cong \mathbb{C}^n \otimes \mathbb{C}^n,$    (2.73)

as Hilbert spaces under the unitary map $v : a \mapsto \sum_{i,j} a_{ij}\,\upsilon_i \otimes \upsilon_j$. This yields

    $v\hat{\pi}_\rho(a)v^* = a \otimes 1_n,$    (2.74)

as an operator on $\mathbb{C}^n \otimes \mathbb{C}^n$, and indeed for any Hilbert spaces $H_1, H_2$ one has

    $(B(H_1) \otimes \mathbb{C} \cdot 1_{H_2})' = \mathbb{C} \cdot 1_{H_1} \otimes B(H_2).$    (2.75)

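• Finally, in the 'intermediate' case the sum in the spectral decomposition (2.6) has $1 < m < n$ nonzero terms. Using the ensuing (partial) basis $(\upsilon_1, \ldots, \upsilon_m)$ of $\mathbb{C}^m$ (viz. $\mathbb{C}^n$), analogously to (2.66) with (2.73) we obtain, up to unitary equivalence,

    $H_\rho \cong \mathbb{C}^n \otimes \mathbb{C}^m;$    (2.76)

    $\hat{\pi}_\rho(a) = a \otimes 1_m;$    (2.77)

    $\Omega_\rho = \sum_{i=1}^m \sqrt{p_i}\,\upsilon_i \otimes \upsilon_i;$    (2.78)

    $\hat{\pi}_\rho(M_n(\mathbb{C}))' \cong M_m(\mathbb{C}).$    (2.79)

The relevance of all this to the decomposition of states on $B(H)$ is as follows.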
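Proposition 2.16. Let $\omega$ be a state on $B(H) = M_n(\mathbb{C})$. Then each decomposition

    $\omega = \sum_i p_i \omega_i,$    (2.80)

where the $p_i$ are probabilities (but the states $\omega_i$ are not necessarily pure) is induced by a family $(A_i)$ of nonzero operators in the commutant $\pi_\omega(B(H))'$ that satisfy:

    $0 \leq A_i \leq 1;$    (2.81)

    $\sum_i A_i = 1.$    (2.82)

Namely, given such a family of operators $A_i$, the decomposition (2.80) is given by:

    $p_i = \langle\Omega_\omega, A_i\Omega_\omega\rangle;$    (2.83)

    $\omega_i(a) = \frac{\langle\Omega_\omega, \pi_\omega(a)A_i\Omega_\omega\rangle}{\langle\Omega_\omega, A_i\Omega_\omega\rangle}.$    (2.84)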
Proof. The claim that such a family yields (2.80) is trivial, except for the remark that automatically $p_i > 0$, since $p_i = 0$ would imply $\sqrt{A_i}\,\Omega_\omega = 0$ and hence

    $\sqrt{A_i}\,\pi_\omega(a)\Omega_\omega = \pi_\omega(a)\sqrt{A_i}\,\Omega_\omega = 0$

for any $a \in B(H)$; by (1.72) this gives $\sqrt{A_i} = 0$ and therefore $A_i = (\sqrt{A_i})^2 = 0$.

Conversely, each state $\omega_i$ in (2.80) defines a sesquilinear form $Q_i$ on $H_\omega$ by $Q_i(a_\omega, b_\omega) = p_i\omega_i(a^*b)$, which is well defined by $p_i\omega_i(a^*a) \leq \omega(a^*a)$ and (A.1), and is positive because $\omega_i$ is a state. Proposition A.23 then provides us with a positive operator $A_i$ for which $Q_i(a_\omega, b_\omega) = \langle a_\omega, A_ib_\omega\rangle$, hence $p_i\omega_i(a^*b) = \langle a_\omega, A_ib_\omega\rangle$. Next,

    $\langle a_\omega, A_i\pi_\omega(c)b_\omega\rangle = \langle a_\omega, A_i(cb)_\omega\rangle = p_i\omega_i(a^*cb) = \langle (c^*a)_\omega, A_ib_\omega\rangle = \langle a_\omega, \pi_\omega(c)A_ib_\omega\rangle,$

so $A_i \in \pi_\omega(B(H))'$. Finally, the bound (2.81) corresponds to $0 \leq p_i \leq 1$ in (2.80), whilst $\omega(1) = 1$, or equivalently $\sum_i p_i = 1$, yields (2.82). □

We now complete the proof of Proposition 2.15. We assume (2.33), where we initially take $\rho$ to be invertible. We omit the hat in (2.70) as well as the suffix $\omega$ or $\rho$ on vectors. As noted, we then have $\Omega_\rho = \rho^{1/2}$, and we also know that $A_i$ is given by $A_ib = ba_i$ for some $a_i \in M_n(\mathbb{C})$, viz. $a_i = A_i1_n$ (where $1_n = 1_H$ is to be distinguished from $\Omega_\rho = \rho^{1/2}$). In this case, (2.81) means $0 \leq \mathrm{Tr}(b^*ba_i) \leq 1$ for each $b$ with $\mathrm{Tr}(b^*b) = 1$, which is true iff $0 \leq a_i \leq 1$, whereas (2.82) immediately yields $\sum_i a_i = 1$. In terms of such a family $(a_i)$ in $M_n(\mathbb{C})$ itself, the decomposition (2.80) of $\omega = \mathrm{Tr}(\rho\,\cdot)$ into arbitrary states $\omega_i$ follows from (2.83) - (2.84) as

    $p_i = \mathrm{Tr}(\rho a_i);$    (2.85)

    $\omega_i(a) = \mathrm{Tr}(\rho_i a);$    (2.86)

    $\rho_i = \frac{\rho^{1/2}a_i\rho^{1/2}}{\mathrm{Tr}(\rho a_i)}.$    (2.87)

To obtain pure and orthogonal states $\omega_i$, we subsequently ask when the new density matrices $\rho_i$ are mutually orthogonal one-dimensional projections $\rho_i = |\upsilon_i\rangle\langle\upsilon_i|$.

To answer this, we use the spectral theorem (A.37) - (A.38) applied to $\rho$, which gives $\rho = \sum_j p_je_j$ and hence $\rho^{1/2} = \sum_j \sqrt{p_j}\,e_j$, so that

    $\rho^{1/2}a_i\rho^{1/2} = \sum_{j,k} \sqrt{p_jp_k}\,e_ja_ie_k.$    (2.88)

This can only be proportional to a one-dimensional projection if each $a_i$ is a one-dimensional projection that commutes with all spectral projections $e_j$ of $\rho$ (and hence also commutes with $\rho$ itself), and all further constraints on the $a_i$ may then only be satisfied if $a_i = |\upsilon_i\rangle\langle\upsilon_i|$, for some basis $(\upsilon_i)$ of eigenvectors $\upsilon_i$ of $\rho$.

A similar analysis applies to non-invertible $\rho$, the only new point being that projections $e_i$ orthogonal to the range of $\rho$ fall into the null space $N_\rho$, cf. (2.76) - (2.79), and hence do not contribute to (2.52), so that they may be ignored. □
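The formulae $p_i = \mathrm{Tr}(\rho a_i)$ and $\rho_i = \rho^{1/2}a_i\rho^{1/2}/\mathrm{Tr}(\rho a_i)$ used in this proof can be tried out on a small example. The following sketch assumes a hypothetical diagonal invertible $\rho = \mathrm{diag}(0.64, 0.36)$ (so that its square root is explicit) and a hypothetical family $a_1 + a_2 = 1_2$ with $0 \leq a_i \leq 1$ that does not commute with $\rho$; the resulting $\rho_i$ are therefore density matrices but not spectral projections of $\rho$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

sqrt_rho = [[0.8, 0.0], [0.0, 0.6]]
rho = matmul(sqrt_rho, sqrt_rho)            # diag(0.64, 0.36)

# Positive matrices with eigenvalues 0.75 and 0.25, summing to the unit matrix.
a1 = [[0.5, 0.25], [0.25, 0.5]]
a2 = [[0.5, -0.25], [-0.25, 0.5]]

def component(a_i):
    """Return p_i = Tr(rho a_i) and rho_i = rho^(1/2) a_i rho^(1/2) / p_i."""
    p_i = trace(matmul(rho, a_i))
    unnormalized = matmul(sqrt_rho, matmul(a_i, sqrt_rho))
    return p_i, [[x / p_i for x in row] for row in unnormalized]

p1, rho1 = component(a1)
p2, rho2 = component(a2)
recombined = [[p1 * rho1[i][j] + p2 * rho2[i][j] for j in range(2)] for i in range(2)]
```

Here `recombined` reproduces `rho`, each `rho_i` has unit trace, and the weights sum to one.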


2.5 The Born rule from Bohrification

The Bohrification approach to quantum mechanics studies noncommutative algebras of observables like $B(H)$ through their commutative subalgebras. In this section we show how the Born rule (2.8) emerges from that perspective. Our discussion is based on the interplay between the three kinds of (finite-dimensional) C*-algebras:

• $C(X)$ is a C*-algebra under the pointwise operations (1.18) - (1.20) and the supremum-norm (1.24); we still assume that $X$ is finite.

• $B(H)$ is a C*-algebra under the pointwise operations (2.22) - (2.24) and the operator norm (A.18); our standing assumption remains $\dim(H) < \infty$.

• $C^*(a)$ is the C*-algebra generated by $a \in B(H)$ and $1_H$ (i.e., the intersection of all unital C*-algebras in $B(H)$ that contain $a$). If $a^* = a$, then $C^*(a)$ is commutative.

Each of these is unital, since $C(X)$ has a unit $1_X$ (i.e. the function $x \mapsto 1$), $B(H)$ has a unit $1_H$ (i.e. the operator $\psi \mapsto \psi$), and $C^*(a)$ shares the unit $1_H$. The first two classes overlap just in case $\dim(H) = 1$ and $X$ is a singleton (in which case $B(\mathbb{C}) = C(*) = \mathbb{C}$); otherwise, the fundamental difference between the two is that $C(X)$ is commutative in that $fg = gf$ for all $f, g$, whereas $B(H)$ is non-commutative. However, the system of C*-algebras $C^*(a)$ within $B(H)$, where $a \in B(H)_{\mathrm{sa}}$ varies, to some extent bridges the gap between the commutative and the non-commutative worlds. This relatively simple situation goes to the heart of exact Bohrification.

Theorem 2.17. Let $a^* = a \in B(H)$, where $H$ is a finite-dimensional Hilbert space.

1. The commutative C*-algebra $C^*(a)$ consists of all polynomials in $a$.

2. Any element of $C^*(a)$ is a linear combination of the spectral projections $e_\lambda$ of $a$.

3. For functions $f : \sigma(a) \to \mathbb{C}$, the map $f \mapsto f(a)$ defined by

    $f(a) = \sum_{\lambda \in \sigma(a)} f(\lambda) \cdot e_\lambda$    (2.89)

gives a (necessarily unital) isomorphism of commutative C*-algebras

    $C(\sigma(a)) \cong C^*(a).$    (2.90)

Proof. Noting that any function on the finite subset $\sigma(a)$ of $\mathbb{R}$ is continuous, this is a restatement of Theorem A.15 for finite-dimensional Hilbert spaces. □
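As a small illustration of (2.89), take $a$ to be the Pauli matrix $\sigma_x$ (our choice of example), with $\sigma(a) = \{1, -1\}$ and spectral projections $e_{\pm 1} = \frac{1}{2}(1_2 \pm \sigma_x)$. The sketch below implements $f \mapsto f(a)$ and allows one to check that it is multiplicative and reproduces $a$ itself from the function $\lambda \mapsto \lambda$.

```python
# Spectral projections of sigma_x = [[0, 1], [1, 0]].
E_PLUS = [[0.5, 0.5], [0.5, 0.5]]       # eigenvalue +1
E_MINUS = [[0.5, -0.5], [-0.5, 0.5]]    # eigenvalue -1
SPECTRAL_PROJECTIONS = {1: E_PLUS, -1: E_MINUS}

def functional_calculus(f):
    """f(a) = sum over lambda in sigma(a) of f(lambda) e_lambda, cf. (2.89)."""
    return [[sum(f(lam) * e[i][j] for lam, e in SPECTRAL_PROJECTIONS.items())
             for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def identity_function(lam):
    return lam

def shift(lam):
    return lam + 2

sigma_x = functional_calculus(identity_function)
product_of_images = matmul(functional_calculus(identity_function),
                           functional_calculus(shift))
image_of_product = functional_calculus(lambda lam: identity_function(lam) * shift(lam))
```

Both `product_of_images` and `image_of_product` equal $1_2 + 2\sigma_x$, as multiplicativity of the isomorphism (2.90) requires.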

We now come to the main point. States on unital C*-algebras $A$ may be defined just as in Definitions 1.14 and 2.5, i.e. as positive linear functionals $\omega : A \to \mathbb{C}$ that satisfy $\omega(1_A) = 1$ (cf. Proposition C.5). Recall Theorem 1.15 and Theorem 2.7.

Theorem 2.18. Let $\omega$ be a state on $B(H)$, represented by a density operator $\rho$ via (2.33), and let $a \in B(H)$ be a self-adjoint operator. Then the restriction of $\omega$ to $C^*(a) \subset B(H)$ is a state, which also induces a state $\omega|_{C(\sigma(a))}$ on $C(\sigma(a))$ through (2.89) - (2.90), i.e., $\omega|_{C(\sigma(a))}(f) = \omega(f(a))$. The probability measure on $\sigma(a)$ that corresponds to the state $\omega|_{C(\sigma(a))}$ on $C(\sigma(a))$, then, is given by the Born rule (2.9).
Proof. First, the restriction of a state on a given unital C*-algebra to a unital C*-subalgebra remains a state. Second, isomorphisms of unital C*-algebras pull back to state spaces in that, if $\varphi : A \to B$ is an isomorphism, and $\omega$ is a state on $B$, then $\varphi^*\omega : A \to \mathbb{C}$ is a state on $A$, where $\varphi^*\omega(a) = \omega(\varphi(a))$. We now compute

    $\omega|_{C(\sigma(a))}(f) = \omega(f(a)) = \mathrm{Tr}(\rho f(a)) = \sum_{\lambda \in \sigma(a)} \mathrm{Tr}(\rho e_\lambda) f(\lambda) = \sum_{\lambda \in \sigma(a)} p_a(\lambda) f(\lambda) = E_{p_a}(f),$    (2.91)

where, from left to right, the first equality is just the definition of $\omega|_{C(\sigma(a))}$, whereas the others in turn follow from (2.33), (2.89), (2.8), and (1.9), respectively. □

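Note that Theorem 2.18 implies Theorem 2.2. The simplest nontrivial illustration is:

    $H = \mathbb{C}^n;$    (2.92)

    $\omega = \omega_\psi;$    (2.93)

    $\psi = \sum_{i=1}^n c_iu_i;$    (2.94)

    $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \sum_{i=1}^n \lambda_i |u_i\rangle\langle u_i|,$    (2.95)

with respect to the standard basis $(u_i)$ of $\mathbb{C}^n$, with all $\lambda_i \in \mathbb{R}$ different, cf. (2.42). The C*-algebra $C^*(a) \cong \mathbb{C}^n$ then consists of all diagonal matrices

    $b = \mathrm{diag}(b_1, \ldots, b_n).$    (2.96)

Since obviously

    $\sigma(a) = \{\lambda_1, \ldots, \lambda_n\},$    (2.97)

the isomorphism (2.90) is given by

    $f \mapsto \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n)).$    (2.98)

The computation (2.91) in the proof of Theorem 2.18 then becomes

    $\omega_\psi|_{C(\sigma(a))}(f) = \langle\psi, \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n))\psi\rangle = \sum_{i=1}^n |c_i|^2 f(\lambda_i) = \sum_{i=1}^n p_a(\lambda_i) f(\lambda_i),$    (2.99)

from which the Born probabilities $p_a$ may be read off as the familiar expressions

    $p_a(\lambda_i) = |c_i|^2.$    (2.100)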
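The computation leading to (2.100) is short enough to be replayed numerically. In the following sketch the coefficients `c` and eigenvalues `lambdas` are hypothetical sample values; the two functions compute the left-hand side $\langle\psi, f(a)\psi\rangle$ and the expectation of $f$ with respect to the Born probabilities, respectively.

```python
def born_probabilities(c):
    """p_a(lambda_i) = |c_i|^2 for psi = sum_i c_i u_i and a = diag(lambda_1, ..., lambda_n)."""
    return [abs(ci) ** 2 for ci in c]

def expectation_via_state(c, lambdas, f):
    """<psi, f(a) psi> with f(a) = diag(f(lambda_1), ..., f(lambda_n))."""
    return sum((complex(ci).conjugate() * f(lam) * ci).real
               for ci, lam in zip(c, lambdas))

def expectation_via_measure(c, lambdas, f):
    """Expectation of f on sigma(a) with respect to the Born probabilities."""
    return sum(p * f(lam) for p, lam in zip(born_probabilities(c), lambdas))

c = (0.6, 0.8j)          # unit vector: 0.36 + 0.64 = 1
lambdas = (-1.0, 2.0)    # distinct real eigenvalues

def square(lam):
    return lam * lam

lhs = expectation_via_state(c, lambdas, square)
rhs = expectation_via_measure(c, lambdas, square)
```

Both sides give $0.36 \cdot 1 + 0.64 \cdot 4 = 2.92$ up to rounding.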
For an analogous treatment of the generalized Born rule (2.21), we first refer to Definition A.16 for the pertinent definitions, especially of the joint spectrum

    $\sigma(\underline{a}) \subseteq \sigma(a_1) \times \cdots \times \sigma(a_n) \subset \mathbb{R}^n$

of a family $\underline{a} = (a_1, \ldots, a_n)$ of commuting self-adjoint operators. As in the case of a single operator, we define $C^*(\underline{a})$ as the smallest unital C*-subalgebra of $B(H)$ that contains each $a_i$. Generalizing Theorem A.15, we have:

Theorem 2.19. Let $\underline{a} = (a_1, \ldots, a_n)$ be commuting self-adjoint operators on $H$. Then $C^*(\underline{a})$ is commutative, and there is a unique isomorphism of C*-algebras

    $C^*(\underline{a}) \cong C(\sigma(\underline{a})),$    (2.101)

under which $1_H \in C^*(\underline{a})$ corresponds to the unit function $1_{\sigma(\underline{a})} : \lambda \mapsto 1$ in $C(\sigma(\underline{a}))$, and $a_i \in C^*(\underline{a})$ corresponds to the projection $\pi_i : \lambda \mapsto \lambda_i$ in $C(\sigma(\underline{a}))$.

For further discussion, see Appendix A, Theorem A.17.

Theorem 2.18 may then be generalized in the following way, with similar proof.

Theorem 2.20. Let $\omega$ be a state on $B(H)$, represented by a density operator $\rho$, and let $\underline{a} = (a_1, \ldots, a_n)$ be commuting self-adjoint operators on $H$. Then the restriction of $\omega$ to $C^*(\underline{a}) \subset B(H)$ is a state, which induces a state $\omega|_{C(\sigma(\underline{a}))}$ on $C(\sigma(\underline{a}))$ through the isomorphism (2.101). The probability measure on the joint spectrum $\sigma(\underline{a})$ that corresponds to $\omega|_{C(\sigma(\underline{a}))}$ is then given by the generalized Born rule (2.21), i.e.,

    $p_{\underline{a}}(\lambda) = \mathrm{Tr}(\rho e_\lambda).$    (2.102)

Strictly speaking, in the present context one should restrict (2.21) to $\lambda \in \sigma(\underline{a})$, but the claim is correct even if one does not, for the (Born) probability assigned to values $\lambda \in \sigma(a_1) \times \cdots \times \sigma(a_n)$ that do not lie in $\sigma(\underline{a})$ is simply zero.

As shown in Proposition A.19 in Appendix A, the multi-operator case is a special case of the single-operator case, in that $C^*(\underline{a}) = C^*(a)$ for a suitable self-adjoint operator $a$. Since the converse is obvious, Theorems 2.18 and 2.20 are equivalent. Corollary A.20 in Appendix A even shows that any unital commutative C*-algebra $C$ in $B(H)$ takes the form $C = C^*(a)$ for some self-adjoint operator $a \in B(H)$. Comparing the restrictions of a state $\omega$ on $B(H)$ to $C$ as the latter varies therefore comes down to asking how the various Born probability distributions $p_a$ on $C^*(a)$ are related to each other as $a$ varies. It is clear from (2.8) that if $p_a$ and $p_b$ come from the same density operator $\rho$ (as the notation indicates), then for $\lambda \in \sigma(a)$ and $\mu \in \sigma(b)$,

    $e^{(a)}_\lambda = e^{(b)}_\mu \;\Rightarrow\; p_a(\lambda) = p_b(\mu).$    (2.103)

Indeed, this is the only compatibility condition between $p_a$ and $p_b$, showing that $p_a(\lambda)$ only depends on $a$ and $\lambda$ through the associated spectral projection $e^{(a)}_\lambda$. Condition (2.103) is a version of a general property of quantum mechanics called non-contextuality, which in this case means that, given its spectral projection $e^{(a)}_\lambda$, the 'context' operator $a$ is otherwise irrelevant for the Born probability $p_a(\lambda)$.
2.6 The Kadison-Singer Problem

It should be clear from the example in the previous section that pure states $\omega_\psi$ on $B(H)$ may well give rise to mixed states on $C^*(a)$; referring to (2.94) and (2.100), this is the case whenever $c_i \neq 0$ for more than one value of the index $i$. If, on the other hand, $c_i \neq 0$ for just a single value $i = j$, then $\psi = u_j$ (up to a phase), or, equivalently, $\omega_\psi(a) = \langle u_j, au_j\rangle$. In that case, the given state $\omega_\psi$ is pure both on $B(H)$ and on $C^*(a)$, and the associated probability measure $\omega_\psi|_{C(\sigma(a))}$ on the spectrum $\sigma(a)$ is supported by a single point, namely $\lambda_j \in \sigma(a)$.

This example suggests a general problem (first posed in the non-trivial case where $H$ is infinite-dimensional by Kadison and Singer in 1959) that is of great relevance for the Bohrification program. Namely, let $A$ be a maximal commutative unital C*-algebra in $B(H)$ and let $\omega_A$ be a pure state on $A$. We may then ask:

1. Does $\omega_A$ have an extension to a state $\omega$ on $B(H)$ at all (i.e., $\omega|_A = \omega_A$)?

2. If so, is $\omega$ uniquely determined by its restriction $\omega_A$?

3. Either way, if $\omega$ exists, can it be chosen so as to be pure (assuming $\omega_A$ is)?

If $\dim(H) < \infty$, all these questions are easy to answer at one stroke:

Theorem 2.21. Let $\dim(H) < \infty$ and let $\omega_A$ be a pure state on a maximal commutative unital C*-algebra $A$ in $B(H)$. Then $\omega_A$ has a unique extension to a state $\omega$ on $B(H)$, which is necessarily pure.

Proof. As explained after the proof of Corollary A.20 in Appendix A, we may simply assume that $H = \mathbb{C}^n$ and that $A$ consists of all diagonal matrices; call this collection $D_n(\mathbb{C})$ (for every other case is unitarily equivalent to this one). Clearly,

    $D_n(\mathbb{C}) \cong C(\{1, \ldots, n\}) \cong \mathbb{C}^n,$    (2.104)

from which we see that if $\omega_A$ is pure, then it must be given on $b \in D_n(\mathbb{C})$ by

    $\omega_A(b) = b_j,$    (2.105)

for some $j$, cf. (2.96). If $\omega$ exists, it is given by (2.33). Using (2.6), condition (2.105) then enforces the following constraint on the $p_i$ and $\upsilon_i$ (where $(u_i)$ is the standard basis of $\mathbb{C}^n$ and $(\upsilon_i)$ is an orthonormal set diagonalizing the density operator $\rho$):

    $\sum_i p_i |\langle u_j, \upsilon_i\rangle|^2 = 1.$    (2.106)

Since $\sum_i p_i = 1$ and $|\langle u_j, \upsilon_i\rangle| \leq 1$, eq. (2.106) can only hold, for given $j$, if

    $|\langle u_j, \upsilon_i\rangle| = 1$    (2.107)

for all $i$ with $p_i > 0$. Since $u_j$ is a unit vector whilst the $(\upsilon_i)$ are an orthonormal set, (2.107) can only be true if there is a single $i$ for which $p_i > 0$, namely $i = j$ (and hence $p_j = 1$), in which case $\upsilon_j$ must equal $u_j$ up to a phase. Hence $\rho = |u_j\rangle\langle u_j|$, which shows that $\rho$ exists, is unique, and is pure. □




At least in operational interpretations of quantum mechanics, this theorem implies that a pure quantum state (i.e., on $B(H)$) is completely determined by the outcome of a measurement of some maximal observable $a$, whose outcome, after all, gives one of the eigenvalues $\lambda_j$ in (2.95) and hence fixes the post-measurement state to be the one given by (2.105). This is, indeed, a typical way of preparing a state.

As one might expect, this is no longer true if $A = C^*(a)$ fails to be maximal (in which case a measurement of $a$ would not provide enough information about the quantum state). Namely, suppose $a = \sum_{\lambda \in \sigma(a)} \lambda \cdot e_\lambda$ as in (A.37); the maximal case occurs iff $\mathrm{Tr}(e_\lambda) = \dim(H_\lambda) = 1$ for all $\lambda \in \sigma(a)$ (equivalently, all eigenvalues $\lambda_i$ in (A.37) are different). If not, suppose $\dim(H_\lambda) > 1$ for some $\lambda$. Then any unit vector $\psi \in H_\lambda$ gives rise to a pure state $\omega_\psi$ on $B(H)$, which remains pure on $A$ (it is given by $\omega_\psi|_A(a) = \lambda$ and hence induces the Dirac probability measure $\delta_\lambda$ on $\sigma(a)$).

Dropping the purity condition on $\omega_A$ loses uniqueness of the extension $\omega$, too, even if $A$ is maximal: take $b = \mathrm{diag}(b_1, \ldots, b_n) \in A = D_n(\mathbb{C})$, and assume that

    $\omega_A(b) = \sum_i p_ib_i$    (2.108)

has more than one term (with $p_i > 0$ and $\sum_i p_i = 1$ as always), cf. (2.105). Then:

• any pure state $\omega_\psi$ as in (2.94), such that $|c_i|^2 = p_i$ for all $i$, extends $\omega_A$;

• the "decohered" mixed state $\rho = \sum_i p_i |u_i\rangle\langle u_i|$ extends $\omega_A$, too.
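The two extensions just listed can be compared directly for $n = 2$. In the sketch below the probabilities `p`, the diagonal matrix `diag_b` and the off-diagonal observable `sigma_x` are hypothetical sample choices: both states agree on diagonal matrices but differ on `sigma_x`.

```python
import math

def expectation_pure(c, a):
    """omega_psi(a) = <psi, a psi> for psi = sum_i c_i u_i."""
    n = len(c)
    return sum(complex(c[i]).conjugate() * a[i][j] * c[j]
               for i in range(n) for j in range(n)).real

def expectation_decohered(p, a):
    """Tr(rho a) for the diagonal density matrix rho = sum_i p_i |u_i><u_i|."""
    return sum(p_i * a[i][i] for i, p_i in enumerate(p))

p = (0.25, 0.75)
c = (math.sqrt(p[0]), math.sqrt(p[1]))      # |c_i|^2 = p_i

diag_b = [[3.0, 0.0], [0.0, -1.0]]          # an element of D_2(C)
sigma_x = [[0.0, 1.0], [1.0, 0.0]]          # not in D_2(C)

on_diagonal = (expectation_pure(c, diag_b), expectation_decohered(p, diag_b))
off_diagonal = (expectation_pure(c, sigma_x), expectation_decohered(p, sigma_x))
```

The entries of `on_diagonal` coincide (both equal $\sum_i p_ib_i = 0$), whereas those of `off_diagonal` do not, so the two states are distinct extensions of the same $\omega_A$.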

Further insight in the state extension problem comes from the following result. 

Proposition 2.22. Let $A$ be any unital C*-algebra in $B(H)$ (i.e., $A$ is not necessarily commutative) and let $\omega_A$ be a pure state on $A$. Then the set

    $S_A = \{\omega \in S(B(H)) \mid \omega|_A = \omega_A\}$    (2.109)

of all states on $B(H)$ whose restriction $\omega|_A$ to $A$ is the given state $\omega_A$, is a compact convex subspace of the total state space $S(B(H))$ of $B(H)$, whose extreme boundary $\partial_eS_A$ consists of pure states on $B(H)$, i.e., $\partial_eS_A \subseteq P(B(H))$. Consequently, $\omega_A$ has a unique extension to a state on $B(H)$ iff it has a unique pure extension.

Proof. Convexity and ($w^*$) compactness are obvious. Let $\omega \in \partial_eS_A$ and suppose $\omega = t\omega_1 + (1-t)\omega_2$ for some $t \in (0,1)$ and $\omega_1, \omega_2 \in S(B(H))$. By assumption, $\omega_A = \omega|_A = t\omega_1|_A + (1-t)\omega_2|_A$ is pure on $A$, so $\omega_1|_A = \omega_2|_A = \omega_A$, hence $\omega_1, \omega_2 \in S_A$. Since $\omega \in \partial_eS_A$, this implies $\omega_1 = \omega_2 = \omega$. Hence $\omega$ is pure on $B(H)$.

Finally, $S_A$ is a singleton iff its boundary $\partial_eS_A$ is (since any state in $S_A$ has a convex decomposition in terms of states in its boundary), yielding the last claim. □

This proposition remains true for infinite-dimensional $H$ (and even for arbitrary C*-algebras), but Theorem 2.21 becomes much more complicated. As we shall see, maximal commutative unital C*-subalgebras of $B(H)$ are no longer unique up to unitary equivalence, and the validity of the claim depends on which type of maximal subalgebra is considered. Also, the proof of what then is called the Kadison-Singer Conjecture becomes extremely difficult (with questionable relevance to physics).
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2.7 Gleason’s Theorem 

Gleason’s Theorem answers the following question in the affirmative: given probability distributions $p_a$ on $\sigma(a)$, for each self-adjoint operator $a \in B(H)$, satisfying (2.103), is there a single state $\omega$ on $B(H)$ inducing these probabilities through the Born rule? This question is closely related to various others that involve equivalent structures, cf. Definition 1.1. We denote the unit sphere in $H$ by $H_1 = \{\psi \in H, \|\psi\| = 1\}$, and write $\mathcal{P}(H) = \{e \in B(H) \mid e^2 = e^* = e\}$ for the set of all projections on $H$.

Definition 2.23. Let $H$ be a finite-dimensional Hilbert space, with unit sphere $H_1$.

1. A probability distribution on $\mathcal{P}_1(H)$ is a map $p : H_1 \to [0,1]$ that satisfies

$\sum_{i=1}^{\dim H} p(v_i) = 1$, for any basis $(v_i)$ of $H$. (2.110)

2. A probability measure on $\mathcal{P}(H)$ is a map $P : \mathcal{P}(H) \to [0,1]$ that satisfies:

$P(e + f) = P(e) + P(f)$ whenever $ef = 0$ (i.e., $eH \perp fH$); (2.111)

$P(1_H) = 1$. (2.112)

Note that $p$ is really defined on the set $\mathcal{P}_1(H)$ of one-dimensional projections, for we have $p(zv) = p(v)$ for all $z \in \mathbb{T}$ and $v \in H_1$; to see this, extend $zv$ and $v$ to a basis of $H$ in the same way and use (2.110). As in Definition 1.1, these notions of probability are equivalent, cf. (A.28):

• Given a probability measure $P$, one obtains a probability distribution $p$ by

$p(v) = P(e_v)$. (2.113)

• Given a probability distribution $p$, Lemma 2.24 below guarantees that

$P(e) = \sum_{i=1}^{\dim(eH)} p(v_i)$, (2.114)

where $(v_i)$ is any basis of $eH$, defines a probability measure $P$.

Lemma 2.24. If $p$ is a probability distribution on $\mathcal{P}_1(H)$ and $L \subset H$ is a linear subspace, with basis $(v_i)$, then $\sum_i p(v_i)$ is independent of this basis choice.

Proof. Extend $(v_i)$ to a basis of $H$ by adding a basis $(v'_j)$ of $L^\perp$. Take another basis $(v''_i)$ of $L$ and complete it to a basis of $H$ by using the same basis $(v'_j)$ of $L^\perp$. Then

$\sum_i p(v_i) + \sum_j p(v'_j) = \sum_i p(v''_i) + \sum_j p(v'_j) = 1$, (2.115)

where we once again used (2.110). Hence $\sum_i p(v_i) = \sum_i p(v''_i)$. □
Clearly, a state $\omega$ on $B(H)$ induces a probability measure $P$ on $\mathcal{P}(H)$ by

$P(e) = \omega(e) = \mathrm{Tr}(\rho e)$, (2.116)

where $\rho$ is the density operator associated to $\omega$, as in (2.33). Therefore, it is a natural question whether any probability measure on $\mathcal{P}(H)$ is induced by some state on $B(H)$ via (2.116). This question is equivalent to the one above:

Proposition 2.25. • A probability measure $P$ on $\mathcal{P}(H)$ induces non-contextual probability distributions $p_a$ on $\sigma(a)$ for each self-adjoint $a \in B(H)$ by

$p_a(\lambda) = P(e_\lambda)$; (2.117)

• Conversely, a family $(p_a)$ of non-contextual probability distributions (i.e. satisfying (2.103)) gives rise to a probability measure $P$ on $\mathcal{P}(H)$ by

$P(e) = p_e(1)$. (2.118)

Proof. As defined by (2.117), $p_a$ is a probability distribution on $\sigma(a)$: by (A.38),

$\sum_{\lambda \in \sigma(a)} p_a(\lambda) = \sum_{\lambda \in \sigma(a)} P(e_\lambda) = P\Big(\sum_{\lambda \in \sigma(a)} e_\lambda\Big) = P(1_H) = 1$. (2.119)

Conversely, suppose $ef = 0$. Introduce $g = 1 - e - f$, and consider the self-adjoint operator $a = \lambda_1 e + \lambda_2 f + \lambda_3 g$, for three different real numbers $\lambda_1, \lambda_2, \lambda_3$. By (2.103),

$P(e) = p_e(1) = p_a(\lambda_1)$, $P(f) = p_f(1) = p_a(\lambda_2)$, $P(g) = p_g(1) = p_a(\lambda_3)$.

Furthermore, since $\sigma(a) = \{\lambda_1, \lambda_2, \lambda_3\}$, we have $p_a(\lambda_1) + p_a(\lambda_2) + p_a(\lambda_3) = 1$ and hence $P(e) + P(f) + P(g) = 1$. Also, $P(e + f) + P(g) = P(e + f + g) = P(1_H) = 1$. The last two equations give $P(e + f) = P(e) + P(f)$. □

Suppose $(e_1, \ldots, e_N)$ is a family of projections on $H$ such that $\sum_i e_i = 1_H$ and $e_i e_j = \delta_{ij} e_i$. Such a family generates a commutative unital C*-algebra $C = C^*(e_1, \ldots, e_N)$ in $B(H)$, which coincides with $C^*(a)$ for $a = \sum_i \lambda_i e_i$, where all $\lambda_i \in \mathbb{R}$ are different, so that $\sigma(a) = \{\lambda_1, \ldots, \lambda_N\}$. All commutative unital C*-algebras in $B(H)$ arise in this way, and $C$ is maximally abelian iff $N = \dim(H)$, i.e., iff each $e_i$ is one-dimensional. The point is that a probability measure $P$ on $\mathcal{P}(H)$ induces a state $\omega_C$ on each $C = C^*(e_1, \ldots, e_N)$ (or, for $C = C^*(a)$, a probability measure $p_a$ on $\sigma(a)$):

1. if $a \in C$ is self-adjoint, then we have unique spectral resolutions (A.37), and put

$\omega_C(a) = \sum_{\lambda \in \sigma(a)} \lambda\, P(e_\lambda)$. (2.120)

2. if $c = a + ib \in C$ with $a$ and $b$ self-adjoint, we define $\omega_C(c) = \omega_C(a) + i\omega_C(b)$.

By Lemma 2.24, the map $\omega_C$ thus defined coincides with the linear extension of the map $e_i \mapsto P(e_i)$ to $C$, which also shows that $\omega_C$ is linear. Clearly, $\omega_C$ is a state on $C$.
Again by Lemma 2.24, the ensuing family of states $\omega_C$ on all commutative unital C*-algebras $C \subset B(H)$ is non-contextual (or, one might say, compatible) in the sense that if $b \in C \cap C'$, then $\omega_C(b) = \omega_{C'}(b)$. In particular, if $C' \subset C$, then $\omega_C|_{C'} = \omega_{C'}$ (where $\omega_C|_{C'}$ is the restriction of $\omega_C$ to $C'$). It is convenient to extend this non-contextual family $(\omega_C)$ of states to a well-defined map $\omega : B(H) \to \mathbb{C}$ by putting

$\omega(a + ib) = \omega_{C^*(a)}(a) + i\,\omega_{C^*(b)}(b)$ $(a, b \in B(H),\ a^* = a,\ b^* = b)$. (2.121)

Definition 2.26. A quasi-state on $B(H)$ is a map $\omega : B(H) \to \mathbb{C}$ that is positive ($\omega(a^*a) \geq 0$) and normalized ($\omega(1_H) = 1$), cf. Definition 2.4, and otherwise:

1. satisfies $\omega(a) = \omega(a') + i\omega(a'')$, where $a' = \frac{1}{2}(a + a^*)$ and $a'' = -\frac{1}{2}i(a - a^*)$.

2. is linear on each commutative unital C*-algebra in $B(H)$.

Note that $a'$ and $a''$ are self-adjoint, so that $\omega$ is fixed by its values on the self-adjoint part $B(H)_{sa}$. Hence we have $\omega(za) = z\omega(a)$, $z \in \mathbb{C}$, and $\omega(a + b) = \omega(a) + \omega(b)$ whenever $ab = ba$.

Proposition 2.27. The map $\omega : B(H) \to \mathbb{C}$ defined by (2.120) and (2.121) is a quasi-state on $B(H)$. Any quasi-state on $B(H)$ arises in this way, giving a bijective correspondence between quasi-states on $B(H)$ and probability measures on $\mathcal{P}(H)$.

Proof. The first claim holds by construction. Conversely, a quasi-state $\omega$ yields a probability measure $P$ via $P(e) = \omega(e)$, cf. (2.116). □

Theorem 1.15 shows that each state on C{X) is induced by a probability measure 
(and, trivially, also the other way round). Although Theorem 2.7 is already a quan¬ 
tum version of Theorem 1.15, an even better parallel would involve the probability 
measures of Definition 2.23. This is indeed what Gleason’s Theorem achieves, en 
passant answering all versions of our lead question: 

Theorem 2.28. Let $H$ be a finite-dimensional Hilbert space of dimension > 2. Then each probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique state $\omega$ on $B(H)$ via

$P(e) = \omega(e)$. (2.122)

Equivalently, each probability distribution $p$ on $\mathcal{P}_1(H)$ is given by

$p(v) = \langle v, \rho v\rangle$, (2.123)

where $\rho$ is a unique density operator on $H$. Hence every quasi-state is a state.
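
As a numerical illustration of (2.123) (not part of the proof), the following Python sketch checks that $p(v) = \langle v, \rho v\rangle$ sums to one over randomly generated orthonormal bases of $\mathbb{R}^3$, as (2.110) requires. The matrix RHO and the helper names gram_schmidt, born, and basis_sum are assumptions made for this example; only the easy direction (a density operator yields a probability distribution) is being checked, not Gleason's converse.

```python
import math
import random


def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent real vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def born(rho, v):
    """p(v) = <v, rho v> for a real symmetric matrix rho and a real unit vector v."""
    n = len(v)
    rho_v = [sum(rho[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * rho_v[i] for i in range(n))


# Hypothetical density matrix: symmetric, positive definite, unit trace.
RHO = [[0.5, 0.1, 0.0],
       [0.1, 0.3, 0.1],
       [0.0, 0.1, 0.2]]


def basis_sum(rho, rng):
    """Sum of p over a random orthonormal basis of R^3 (should equal Tr(rho))."""
    raw = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(3)]
    return sum(born(rho, v) for v in gram_schmidt(raw))
```

Calling basis_sum(RHO, random.Random(0)) repeatedly should return values equal to 1 up to rounding error, whichever basis is drawn.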

This completes the following list (of which 1-5 do not require Gleason’s Theorem). 

Corollary 2.29. Let $H$ be a finite-dimensional Hilbert space. The following notions are equivalent (i.e., there are natural bijective correspondences between them):

1. Non-contextual families of states on commutative unital C*-algebras $C \subset B(H)$;

2. Non-contextual families of probability measures on spectra $\sigma(a)$, cf. (2.103);

3. Probability distributions on $\mathcal{P}_1(H)$;

4. Probability measures on $\mathcal{P}(H)$;

5. Quasi-states on $B(H)$;

6. States on $B(H)$.
2.8 Proof of Gleason’s Theorem

The difficulty of Theorem 2.28 should already be clear from the fact that it is false if $\dim(H) = 2$: as we have seen in (2.37), a state on $M_2(\mathbb{C}) = B(\mathbb{C}^2)$ is given by three real parameters, whereas a probability measure $P$ on $\mathcal{P}(\mathbb{C}^2)$ can assign arbitrary values $P(e)$ to one-dimensional projections $e$, as long as $P(1 - e) = 1 - P(e)$. Equivalently, this time from the perspective of probability distributions $p$, each unit vector in $\mathbb{C}^2$ belongs to a unique basis (up to a phase), so that $p$ can assign an arbitrary value to one of the two vectors in each basis and is unconstrained otherwise.
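
The failure in dimension two can be made concrete. The sketch below works in the real plane $\mathbb{R}^2$ rather than $\mathbb{C}^2$ (a simplifying assumption), with unit vectors $(\cos\theta, \sin\theta)$. The function p_exotic is a hypothetical example chosen for illustration: it satisfies the basis condition $p(\theta) + p(\theta + \pi/2) = 1$, yet it cannot be of the Born form $\langle v, \rho v\rangle$, since any such form with unit trace sums to $3/2$ over three unit vectors at mutual angles of 60 degrees.

```python
import math


def p_exotic(theta):
    """A non-Born probability distribution on unit vectors (cos t, sin t) of R^2."""
    return 0.5 + 0.5 * math.cos(6.0 * theta)


def p_born(theta, rho):
    """<v, rho v> for v = (cos t, sin t) and a real symmetric 2x2 matrix rho."""
    c, s = math.cos(theta), math.sin(theta)
    return rho[0][0] * c * c + 2.0 * rho[0][1] * c * s + rho[1][1] * s * s


def three_point_sum(p):
    """p summed over the unit vectors at angles 0, 60 and 120 degrees."""
    return sum(p(k * math.pi / 3.0) for k in range(3))
```

For a unit-trace matrix such as rho = [[0.7, 0.2], [0.2, 0.3]], three_point_sum(lambda t: p_born(t, rho)) is 1.5 up to rounding, whereas three_point_sum(p_exotic) is 3.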

In higher dimensions, however, one-dimensional projections always belong to 
infinitely many orthogonal sets, whilst unit vectors belong to infinitely many bases. 
This constrains the possible values P or p may take, and these constraints turn out 
to be strong enough to enforce (2.116). 

The proof of Theorem 2.28 consists of two nontrivial parts, the second of which is notoriously difficult. Unusually for quantum-mechanical reasoning, both involve $\mathbb{R}^3$ as a real Hilbert space, whose elements $\mathbf{x} = (x, y, z)$ have standard inner product

$\langle \mathbf{x}, \mathbf{x}'\rangle = xx' + yy' + zz'$, (2.124)

with the ensuing (Pythagorean) norm and (Euclidean) notion of orthogonality.

Proposition 2.30. If Theorem 2.28 holds for the real Hilbert space $\mathbb{R}^3$, then it holds for any complex finite-dimensional Hilbert space of dimension > 2.

Proposition 2.31. Theorem 2.28 holds for the real Hilbert space $\mathbb{R}^3$.

Proposition 2.30 is a conjunction of two lemmas. 

Lemma 2.32. If (2.123) holds for $\mathbb{R}^3$, where $\rho$ is some symmetric operator, then (2.123) holds for $\mathbb{C}^3$, where $\rho$ is a self-adjoint operator.

Neither positivity nor normalization of $\rho$ plays a role in the argument; once we have (2.123) in this more general sense, the conclusion that $\rho$ is a density operator trivially follows from the definition of $p$. This also applies to the second sublemma.

Lemma 2.33. If (2.123) holds for $\mathbb{C}^3$, then it holds for any complex finite-dimensional Hilbert space of dimension > 2.

It will be convenient to extend $p : H_1 \to [0,1]$ to a function $Q : H \to \mathbb{R}$ by

$Q(0) = 0$; (2.125)

$Q(\psi) = \|\psi\|^2\, p(\psi/\|\psi\|)$ $(\psi \neq 0)$, (2.126)

so that (2.123) is evidently equivalent to the analogous expression

$Q(\psi) = \langle \psi, \rho\psi\rangle$ $(\psi \in H)$. (2.127)

Given (2.127), the minimax principle for real symmetric matrices implies that $Q$ is maximized on $H_1$ by $\psi \in H_1$ iff $\rho\psi = \lambda\psi$, where $\lambda$ is the largest eigenvalue of $\rho$.

Proof of Lemma 2.32. Suppose $p : \mathbb{C}^3_1 \to [0,1]$ is a probability distribution (in the sense of Definition 2.23). The first step shows that $p$ assumes a maximum on the unit sphere $\mathbb{C}^3_1$ (note that $\mathbb{C}^3_1$ is compact, but we do not know yet if $p$ is continuous!). Since $0 \leq p(v) \leq 1$ for $v \in \mathbb{C}^3_1$, $M = \sup\{p(v), v \in \mathbb{C}^3_1\}$ exists, and there is a sequence $(v_n)$ in $\mathbb{C}^3_1$ for which $p(v_n) \to M$. Since $\mathbb{C}^3_1$ is compact, this sequence has a convergent subsequence, with limit $v_\infty \in \mathbb{C}^3_1$. Furthermore, we may assume that $\langle v_n, v_\infty\rangle \in \mathbb{R}$, for if not, we change to $v'_n = z_n v_n$ with $z_n = \langle v_\infty, v_n\rangle / |\langle v_n, v_\infty\rangle|$.

For each fixed $n$ (with $v_n$ in the convergent subsequence in question), the real linear span of $v_\infty$ and $v_n$ is isomorphic to $\mathbb{R}^2$ as a Hilbert space (with standard inner product), embedded in any $\mathbb{R}^3 \subset \mathbb{C}^3$ one likes (where, once again, $\mathbb{R}^3$ is seen as a real Hilbert subspace in the sense that all inner products of vectors in $\mathbb{R}^3$ are real). By assumption, (2.123) holds on $\mathbb{R}^3$ and hence also on $\mathbb{R}^2 \subset \mathbb{R}^3$, so that, in particular,

$|p(v_\infty) - p(v_n)| = |\langle v_\infty, \rho v_\infty\rangle - \langle v_n, \rho v_n\rangle| = |\langle (v_\infty - v_n), \rho(v_\infty + v_n)\rangle|$

$\leq \|\rho\|\, \|v_\infty + v_n\|\, \|v_\infty - v_n\| \leq 2\|\rho\|\, \|v_\infty - v_n\|$,

since $\|v_\infty + v_n\| \leq \|v_\infty\| + \|v_n\|$ and $\|v_\infty\| = \|v_n\| = 1$. Consequently,

$|p(v_\infty) - M| \leq |p(v_\infty) - p(v_n)| + |p(v_n) - M| \leq 2\|\rho\|\, \|v_\infty - v_n\| + |p(v_n) - M|$,

so letting $n \to \infty$ makes both terms on the right-hand side vanish. Hence $p(v_\infty) = M$.

For reasons to become clear soon, we relabel $v_\infty = v_1$. Take any $v_0 \in \mathbb{C}^3_1$ with $\langle v_0, v_1\rangle = 0$ and consider the real Hilbert space $\mathbb{R}^2 \subset \mathbb{C}^3$ spanned by $v_1$ and $v_0$. By assumption, (2.127) holds, and by the minimax principle, $\rho v_1 = \lambda_1 v_1 = p(v_1) v_1$, with $p(v_1) = M$. Hence for any $v = t_0 v_0 + t_1 v_1$, with $t_0, t_1 \in \mathbb{R}$, we have

$Q(v) = \langle t_0 v_0 + t_1 v_1, \rho(t_0 v_0 + t_1 v_1)\rangle = |t_0|^2 p(v_0) + |t_1|^2 p(v_1)$. (2.128)

We claim that this also holds for complex coefficients $t_0, t_1 \in \mathbb{C}$. Indeed, by (2.126),

$Q(t_0 v_0 + t_1 v_1) = Q(|t_0| v'_0 + |t_1| v_1) = |t_0|^2 p(v_0) + |t_1|^2 p(v_1)$, (2.129)

where we used (2.128) with $v'_0 = (t_0/t_1)\,|t_1/t_0|\, v_0$ instead of $v_0$; this is still a unit vector orthogonal to $v_1$, and we also used $Q(v'_0) = p(v'_0) = p(v_0)$.

We now repeat this analysis on the part $(\mathbb{C}^3_1)_{\perp v_1}$ of $\mathbb{C}^3_1$ that consists of all unit vectors orthogonal to $v_1$, which remains compact. Thus $p$ assumes a maximum at some unit vector $v_2 \in (\mathbb{C}^3_1)_{\perp v_1}$, and we may complete the pair $(v_1, v_2)$ to a basis $(v_1, v_2, v_3)$ of $\mathbb{C}^3$. With $v_0 = t_2 v_2 + t_3 v_3$, the above argument (on $(\mathbb{C}^3_1)_{\perp v_1}$) gives

$p(v_0) = Q(v_0) = |t_2|^2 p(v_2) + |t_3|^2 p(v_3)$. (2.130)

Combined with (2.129) at $t_0 = 1$, this gives, for any coefficients $t_1, t_2, t_3 \in \mathbb{C}$,

$Q(t_1 v_1 + t_2 v_2 + t_3 v_3) = |t_1|^2 p(v_1) + |t_2|^2 p(v_2) + |t_3|^2 p(v_3)$. (2.131)

Hence (2.127) holds on all of $\mathbb{C}^3$, with

$\rho = p(v_1)|v_1\rangle\langle v_1| + p(v_2)|v_2\rangle\langle v_2| + p(v_3)|v_3\rangle\langle v_3|$. □

Proof of Lemma 2.33. Let $H$ be a complex finite-dimensional Hilbert space of dimension > 3, equipped with a probability distribution $p$, and define $Q : H \to \mathbb{R}$ by (2.125) - (2.126). We need to prove (2.127) for some self-adjoint operator $\rho$. By Propositions A.4 and A.23, this is equivalent to $Q$ being a quadratic form. Since (A.8) evidently holds, we just need to prove (A.9). Given $v, w \in H$, take any three-dimensional Hilbert space $L_3 \subset H$ containing $v$ and $w$. By assumption, there exists a self-adjoint operator $\rho_{L_3}$ on $L_3$ for which (2.127) is valid for all $\psi \in L_3$. Taking $\psi = v$, $\psi = w$, $\psi = v + w$, and $\psi = v - w$ then validates (A.9). This completes the first proof.

This lemma may also be proved without invoking Proposition A.4, as follows.

If $v$ and $w$ are linearly independent, they are contained in a unique two-dimensional subspace $L_2 \subset H$, which in turn is contained in a (non-unique) three-dimensional subspace $L_3 \subset H$. Take $\rho_{L_3}$ as above and define a bilinear form $B$ on $L_2$ by $B(v, w) = \langle v, \rho_{L_3} w\rangle$. Defining the associated quadratic form $Q$ by (A.7), we see that (2.125) - (2.126) hold, from which we also conclude that $B$ is independent of the choice of $L_3 \supset L_2$. If $v$ and $w$ are linearly dependent, a similar argument shows that $B$ is independent of the choice of the subspace $L_2$ containing $v$ and $w$. Hence $B : H \times H \to \mathbb{C}$ is well defined, and to conclude that it is a self-adjoint form we need to check that $B(v, \lambda w + x) = \lambda B(v, w) + B(v, x)$ for all $v, w, x \in H$, $\lambda \in \mathbb{C}$, cf. Definition A.1. If $v$, $w$, and $x$ are linearly independent, this can be done by passing to the unique three-dimensional subspace $L_3 \subset H$ containing these vectors. If they are not, we are already done by the previous step. Finally, given that $B$ is a bilinear form, a self-adjoint operator $\rho$ may be reconstructed from Proposition A.23, upon which (2.127) holds by construction. □

Proposition 2.31 again follows from two lemmas by modus ponens.

Lemma 2.34. Any probability distribution on $\mathbb{R}^3$ (cf. Definition 2.23) is continuous.

Lemma 2.35. Any continuous probability distribution on $\mathbb{R}^3$ satisfies (2.123), for some self-adjoint operator $\rho$.

The operator $\rho$ obtained by Lemma 2.35 is necessarily positive and automatically has unit trace. Another way to phrase this is to take the complex linear span of all probability distributions on the unit sphere $\mathbb{R}^3_1 = S^2$ in $\mathbb{R}^3$; this yields a vector space $\mathcal{F}(S^2)$ whose elements are called frame functions. These are bounded functions

$f : S^2 \to \mathbb{C}$,

with the property that for any basis $(u_1, u_2, u_3)$ of $\mathbb{R}^3$ one has

$f(u_1) + f(u_2) + f(u_3) = w(f)$, (2.132)

where $w(f) \in \mathbb{C}$ does not depend on the basis and is called the weight of the frame function $f$. For a probability distribution $p$ we obviously have $w(p) = 1$. The natural norm on $\mathcal{F}(S^2)$ is the supremum-norm inherited from $C(S^2)$, and like the latter, $\mathcal{F}(S^2)$ is closed in this norm (and hence is a Banach space in its own right, a fact that will play an important technical role in Lemma 2.40 below).

As for probability distributions, (2.132) implies a lemma that will often be used:

Lemma 2.36. If $(u_1, u_2)$ is a basis of some two-dimensional linear subspace of $\mathbb{R}^3$, then $f(u_1) + f(u_2)$ is independent of the choice of this pair. Hence if $C$ is some great circle in $S^2$ and $u_1 \perp u_2$ for $u_1, u_2 \in C$, then $f(u_1) + f(u_2)$ only depends on $C$.

Furthermore, by similar arguments any frame function is even, i.e., $f(-u) = f(u)$.

The proof of Lemma 2.34 will actually show that every frame function on $S^2$ is continuous, whilst the proof of Lemma 2.35 will establish the property that any continuous frame function on $S^2$ satisfies (2.127), for some self-adjoint operator $\rho$.
Proof of Lemma 2.34. Let $f : S^2 \to \mathbb{R}$ be a frame function (the complex-valued case follows by decomposing $f$ into a real and an imaginary part). Since constants are frame functions, adding a constant to $f$ if necessary we may assume

$\inf\{f(x), x \in S^2\} = 0$. (2.133)

Hence for given $\varepsilon > 0$ there exists $p \in S^2$ with

$f(p) < \varepsilon/2$. (2.134)

Performing a rotation if necessary, we may assume that $p = (0, 0, 1)$ is the north pole. It is useful to introduce another frame function $g : S^2 \to \mathbb{R}^+$ by

$g(x) = f(x) + f(R_z(\pi/2)x)$, (2.135)

where $R_z(\pi/2)$ is the (counter-clockwise) rotation around the $z$-axis by an angle $\pi/2$. It is easy to see that $g$ is constant on the equator $E$: for $x \in E$, consider the basis $(x, R_z(\pi/2)x, p)$ of $\mathbb{R}^3$, so that $g(x) = w(f) - f(p)$ is independent of $x$.

Furthermore, for any $U \subset S^2$ consider the oscillation of $f$ at $U$, defined by

$\mathrm{Osc}_U(f) = \sup_U(f) - \inf_U(f) = \sup\{f(u), u \in U\} - \inf\{f(u), u \in U\}$. (2.136)

If, for given $x \in S^2$ and any $\varepsilon > 0$, there is a neighbourhood $U \subset S^2$ of $x$ on which $\mathrm{Osc}_U(f) < \varepsilon$, then $|f(x) - f(u)| < \varepsilon$ for all $u \in U$, so that $f$ is continuous at $x$.

The lengthier steps in the proof of Lemma 2.34 are now as follows:

Lemma 2.37. Given that $g(p) < \varepsilon$, there is an open set $U \subset S^2$ on which

$\mathrm{Osc}_U(g) < 3\varepsilon$.

Lemma 2.38. For any non-negative frame function $h$, if $\mathrm{Osc}_U(h) < \varepsilon'$ for some open $U$, then each point $x \in S^2$ has a neighborhood $V$ where

$\mathrm{Osc}_V(h) < 4\varepsilon'$.
Assuming these lemmas (to be proved below), continuity of $f$ easily follows:

1. Lemmas 2.37 and 2.38 applied to $h = g$ and $x = p$ yield $\mathrm{Osc}_V(g) < 12\varepsilon$ for some neighbourhood $V$ of $p$. Now $g(p) < \varepsilon$, hence $\inf\{g(v), v \in V\} < \varepsilon$, hence

$\sup_V(f) \leq \sup_V(g) \leq \mathrm{Osc}_V(g) + \inf_V(g) < 13\varepsilon$.

2. Since $f \geq 0$ and hence $0 \leq \inf_V(f) \leq \sup_V(f)$, this yields $\mathrm{Osc}_V(f) < 13\varepsilon$.

3. Applying Lemma 2.38 to $h = f$ and $U = V$ gives that each point $x \in S^2$ has a neighborhood $W$ where $\mathrm{Osc}_W(f) < 52\varepsilon$.

4. Hence $|f(x) - f(w)| < 52\varepsilon$ for all $w \in W$. Since $\varepsilon > 0$ was arbitrary, it follows that $f$ is continuous at $x$, and since $x$ was arbitrary, $f$ is continuous on all of $S^2$.

For $p \neq u \in N$, where $N$ is the open northern hemisphere, let $C_u$ be the unique great circle through $u$ with one (and hence both) of the following equivalent properties:

• the point of greatest latitude on $C_u$ is $u$;

• $C_u$ cuts the equator $E$ at two points that are both orthogonal to $u$.

We write $D_u = C_u \cap N$, and for each $z \in N$, we introduce the set

$DD_z = \{x \in N \mid \exists\, y \in D_x,\ z \in D_y\}$. (2.137)

Geometrically, $DD_z$ consists of the points $x$ on the northern hemisphere from which $z$ can be reached by “double descent”, where we say that $y \in N$ may be reached from some point $x$ at higher latitude by (single) descent if $y \in C_x$. The proof of our lemmas relies on the following two facts from spherical geometry (stated without proof, as they have nothing to do with frame functions, though the second is easy).

Lemma 2.39. 1. The set $DD_z$ in (2.137) has open interior.

2. For any $x \in S^2$ there exists $y \in E$ such that $x$ lies on the equator $E_y$ relative to $y$ regarded as the north pole (so in this terminology, $E = E_p$).

Proof of Lemma 2.37. By definition of the infimum, for each $\varepsilon > 0$ there exists $z \in N$ such that

$\inf_N g \leq g(z) < \inf_N g + \varepsilon$. (2.138)

The open $U$ in question will be the interior of $DD_z$. The crucial inequality is

$g(x) < g(z) + 2\varepsilon$ $(x \in DD_z)$, (2.139)

which together with (2.138) yields $\inf_N g \leq g(x) < \inf_N g + 3\varepsilon$ for each $x \in DD_z$, whence $\mathrm{Osc}_U(g) < 3\varepsilon$. So we need to prove (2.139), given the assumption $g(p) < \varepsilon$, which is immediate from (2.134) and (2.135).

To prove (2.139), take $r \in N$ and $s \in C_r \cap E$, so $r \perp s$ and hence

$g(r) + g(s) \leq w(g)$. (2.140)

Furthermore, take $t, u \in E$, $t \perp u$, so that $(t, u, p)$ is a basis and, $g$ being a frame function, we have

$g(t) + g(u) + g(p) = w(g)$. (2.141)

But by construction $g$ is constant on the equator $E$, so $g(t) = g(u) = k$, hence $2k + g(p) = w(g)$, so that (2.140) yields

$g(r) \leq w(g) - g(s) = 2k + g(p) - g(s) = k + g(p)$,

from which

$k - g(r) \geq -g(p)$. (2.142)

Furthermore, for $q \in N$, $x, r \in D_q$, $x \perp r$, there exists $q' \in C_q \cap E$ such that

$g(x) + g(r) = g(q) + g(q') = g(q) + k$,

from which, using (2.142), we obtain

$g(x) = g(q) + k - g(r) \geq g(q) - g(p)$,

and hence

$g(q) \leq g(x) + g(p)$, $q \in N$, $x \in D_q$. (2.143)

Applying this twice to the double descent domain defined in (2.137), we find

$g(x) \leq g(y) + g(p) \leq g(z) + 2g(p)$, $y \in D_x$, $z \in D_y$. (2.144)

Since (2.134) and (2.135) imply $g(p) < \varepsilon$, this yields (2.139). □

Proof of Lemma 2.38. We may assume $p \in U = U_p$. Using Lemma 2.39.2, by the argument to come we then move $U_p$ to a neighborhood of $y$ called $U_y$, and subsequently repeat the argument so as to move $U_y$ to $U_x = V$ as specified in the lemma. We use spherical coordinates $(\theta, \phi)$ for $\mathbf{x} = (x, y, z) \in S^2$ given by

$(x = \cos\phi \sin\theta,\ y = \sin\phi \sin\theta,\ z = \cos\theta)$, $\phi \in [0, 2\pi)$, $\theta \in [0, \pi]$. (2.145)

Hence the north pole $p = (0, 0, 1)$ has $\theta = 0$ and $\phi$ undefined (note that $(\phi, \theta)$ are essentially (longitude, latitude), except that the latter usually starts counting downwards from $\frac{1}{2}\pi$ to $-\frac{1}{2}\pi$, with the north pole having latitude $\frac{1}{2}\pi$). Since $U$ is open, there exists $\delta > 0$ such that all points with $0 \leq \theta < \delta$ belong to $U$. Pick $y \in E$ as above, and define $r$ as the point with the same $\phi$ as $y$ but $\theta_r = \theta_y + \frac{1}{2}\delta$ (so that $r$ lies a little south of $y$). Then inspection of $S^2$ shows that one can find a neighborhood $U_y$ of $y$ with the following property: for any $u \in U_y$ there exists a great circle $C$ through $r$ and $u$ that contains two further points $r' \in U_p$ and $u' \in U_p$ such that $r \perp r'$ and $u \perp u'$. Hence $h(r) + h(r') = h(u) + h(u')$. Doing this for two different points $u = u_1$ and $u = u_2$ gives

$h(r) + h(r'_1) = h(u_1) + h(u'_1)$;
$h(r) + h(r'_2) = h(u_2) + h(u'_2)$.

Hence $h(u_1) - h(u_2) = h(r'_1) - h(r'_2) - (h(u'_1) - h(u'_2))$, from which we obtain

$|h(u_1) - h(u_2)| \leq |h(r'_1) - h(r'_2)| + |h(u'_1) - h(u'_2)| \leq \mathrm{Osc}_U(h) + \mathrm{Osc}_U(h) < 2\varepsilon'$,

for by assumption, $\mathrm{Osc}_U(h) < \varepsilon'$. Since $u_1$ and $u_2$ in $U_y$ were arbitrary, this gives

$\mathrm{Osc}_{U_y}(h) < 2\varepsilon'$. (2.146)

Repeating this with $y$ as the north pole gives $\mathrm{Osc}_{U_x}(h) < 4\varepsilon'$, i.e., the lemma. □

To prove Lemma 2.35, following Gleason himself we consider the natural action of the rotation group $SO(3)$ (with positive determinant) on $\mathbb{R}^3$, written $R : x \mapsto Rx$. This action maps $S^2$ onto itself and hence induces an action $U$ on $C(S^2)$ by pullback:

$U(R)f(u) = f(R^{-1}u)$. (2.147)

By Lemma 2.34 we have inclusions

$\mathcal{F}(S^2) \subset C_e(S^2) \subset C(S^2)$, (2.148)

where $\mathcal{F}(S^2)$ are the frame functions and $C_e(S^2)$ consists of the even functions in $C(S^2)$; both spaces are obviously stable under the action (2.147). The following facts, due to Weyl, which we state without proof, follow from elementary representation theory, but they are also quite easily verified by explicit computation. Let

$\psi_\ell(x, y, z) = (x + iy)^\ell$, $\ell \in \mathbb{N}$, (2.149)

and restrict this function to $S^2$, still calling it $\psi_\ell$. Let $H_\ell \subset C(S^2)$ be the vector space spanned by all transforms $U(R)\psi_\ell$, $R \in SO(3)$. This vector space:

• consists of all homogeneous polynomials of degree $\ell$ that are orthogonal (with respect to the inner product in $L^2(S^2)$) to any such polynomials of degree $\ell - 2$;

• has a basis consisting of the spherical harmonics $Y_\ell^m$, $m = -\ell, \ldots, \ell$;

• accordingly, has finite dimension equal to $\dim(H_\ell) = 2\ell + 1$;

• is irreducible under the natural $SO(3)$-action (2.147).

Indeed, all (necessarily finite-dimensional) irreducible representations of $SO(3)$ arise in this way. Now $\mathcal{F}(S^2)$ is closed under the $SO(3)$-action (2.147), hence so must be $H_\ell \cap \mathcal{F}(S^2)$. Since $H_\ell$ is irreducible, there are merely two possibilities:

$H_\ell \subset \mathcal{F}(S^2)$; (2.150)

$H_\ell \cap \mathcal{F}(S^2) = \{0\}$. (2.151)

Since for even/odd values of $\ell$ the space $H_\ell$ consists of even/odd functions, and $\mathcal{F}(S^2)$ only has even elements, we immediately see that (2.151) applies if $\ell$ is odd. For even values of $\ell$, we see at once that (2.150) holds for:

• $\ell = 0$, where the constant frame function $f(x, y, z) = c = \frac{1}{3}w(f) \neq 0$ is obviously induced by the operator $\rho = c \cdot 1_3$ (where $1_3$ is the 3×3 unit matrix), cf. (2.127);

• $\ell = 2$, which corresponds to frame functions $f$ with weight $w(f) = 0$.
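The latter functions are induced by operators $\rho$ with zero trace. To see this, diagonalize $\rho$ in $\mathbb{R}^3$ as in (2.6), without the constraints on the $p_i$. This yields

$f(x) = \langle x, \rho x\rangle = \sum_{i=1}^{3} p_i |\langle x, v_i\rangle|^2$. (2.152)

For $f \in H_2$, since $H_2 \perp H_0$ in $L^2(S^2)$ we must have

$\int_{S^2} d^2x\, f(x) = 0$. (2.153)

For any $v \in \mathbb{C}^3$, we have

$\int_{S^2} d^2x\, |\langle x, v\rangle|^2 = \frac{4\pi}{3}\|v\|^2$; (2.154)

to see this, write $|\langle x, v\rangle|^2 = |v_1|^2 x^2 + |v_2|^2 y^2 + |v_3|^2 z^2$ plus cross terms (which integrate to zero by symmetry), and use the surface element $d^2x = d\phi\, d\theta \sin\theta$ associated to the spherical coordinates (2.145) to compute

$\int_{S^2} d^2x\, x^2 = \int_{S^2} d^2x\, y^2 = \int_{S^2} d^2x\, z^2 = \frac{4\pi}{3}$. (2.155)

Therefore, from (2.152), noting that $\|v_i\|^2 = 1$ for each $i = 1, 2, 3$, we obtain

$\int_{S^2} d^2x\, f(x) = \frac{4\pi}{3} \sum_i p_i = \frac{4\pi}{3}\mathrm{Tr}(\rho)$. (2.156)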
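
The integrals (2.155) and (2.156) are easy to check numerically. The following sketch uses a plain midpoint rule in the spherical coordinates (2.145); the function names sphere_integral and quadratic_form, the grid sizes, and the sample matrices used with them are assumptions made for this illustration.

```python
import math


def sphere_integral(func, n_theta=200, n_phi=400):
    """Midpoint-rule approximation of the integral of func(x, y, z) over S^2,
    using the surface element sin(theta) dtheta dphi."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x = math.cos(phi) * sin_t
            y = math.sin(phi) * sin_t
            total += func(x, y, cos_t) * sin_t
    return total * d_theta * d_phi


def quadratic_form(rho):
    """Return f(x, y, z) = <x, rho x> for a real symmetric 3x3 matrix rho."""
    def f(x, y, z):
        v = (x, y, z)
        return sum(v[i] * rho[i][j] * v[j] for i in range(3) for j in range(3))
    return f
```

For instance, sphere_integral(lambda x, y, z: z * z) should approximate $4\pi/3$, and for a traceless symmetric matrix the integral of quadratic_form(rho) should be close to zero, in line with (2.153) and (2.156).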

To settle the case $\ell \geq 4$, all we need to know about the spherical harmonics is that if $\ell$ is even, then, once again using spherical coordinates, one has

$Y_\ell^m(x, y, z = 0) \sim e^{im\phi}$ ($m$ even); (2.157)

$Y_\ell^m(x, y, z = 0) = 0$ ($m$ odd). (2.158)

If (2.150) holds, then $Y_\ell^m \in \mathcal{F}(S^2)$ for each $m = -\ell, -\ell + 1, \ldots, \ell - 1, \ell$. But for any (even) $\ell \geq 4$, there are values of $m$ for which $Y_\ell^m$ cannot be a frame function. To see this, take the following family of bases of $\mathbb{R}^3$ indexed by $\phi$:

$u_1 = (\cos\phi, \sin\phi, 0)$; (2.159)

$u_2 = (-\sin\phi, \cos\phi, 0)$; (2.160)

$u_3 = (0, 0, 1)$. (2.161)

For any frame function $f$, the value of $f(u_1) + f(u_2) = w(f) - f(u_3)$ must therefore be independent of $\phi$. However, from (2.157) - (2.158), we find

$Y_\ell^m(u_1) + Y_\ell^m(u_2) \sim e^{im\phi} + e^{im(\phi + \pi/2)} = e^{im\phi}(1 + i^m)$,

which is independent of $\phi$ iff $m = 0$ or $m = 2$ (mod 4). For $\ell = 0, 2$ these are indeed the only values that occur, but as soon as $\ell \geq 4$, the value $m = 4$ (among others) will ruin it. So (2.150) holds only for $\ell = 0$ and $\ell = 2$, whereas (2.151) is the case for all other $\ell \in \mathbb{N}$. Since $H_0$ and $H_2$ occur in $C(S^2)$ with multiplicity one, they cannot have greater multiplicity in $\mathcal{F}(S^2) \subset C(S^2)$, so the above argument suggests that

$\mathcal{F}(S^2) = H_0 \oplus H_2$, (2.162)

which would prove the lemma. Fortunately, this is indeed the case, but to complete the argument we need the following technical results (left out by Gleason himself):

Lemma 2.40. 1. Frame functions are uniformly continuous.

2. The representation (2.147) of $SO(3)$ on $\mathcal{F}(S^2)$ is continuous (in the usual sense that the map $(R, f) \mapsto U(R)f$ from $SO(3) \times \mathcal{F}(S^2)$ to $\mathcal{F}(S^2)$ is continuous) with respect to the supremum-norm on $\mathcal{F}(S^2)$.

3. A continuous representation of a compact group $G$ on a Banach space $B$ is completely reducible (in that $B$ is the closure of the direct sum of all irreducible representations of $G$ that it contains).

Proof. 1. The first claim follows because $S^2$ is compact. Another proof starts from the proof of Lemma 2.38, which has the feature that for given $\varepsilon' > 0$, if $y, y' \in E$ with $y' = R_z(\phi)y$ for some angle $\phi$, then $U_{y'} = R_z(\phi)U_y$ (this is immediately clear from the geometry). Similarly, as $x$ varies over $S^2$, different neighborhoods $V = U_x$ are related by a rotation. Hence the size of $U_x$ is independent of $x$, so that the above proof of continuity established uniform continuity of frame functions also.

2. Let $R_n \to R$ in $SO(3)$ and $f_m \to f$ uniformly in $\mathcal{F}(S^2)$, i.e., $\|f_m - f\|_\infty \to 0$. Then, subtracting and adding a term $U(R_n)f$ and using isometricity of $U$, i.e.,

$\|U(R_n)(f_m - f)\|_\infty = \|f_m - f\|_\infty$,

we obtain the estimate

$\|U(R_n)f_m - U(R)f\|_\infty \leq \|f_m - f\|_\infty + \|U(R_n)f - U(R)f\|_\infty$,

cf. (2.147). As $m \to \infty$ the first term on the right-hand side vanishes by assumption, whilst the second vanishes as $n \to \infty$ by uniform continuity of $f$.

3. This is a Banach space version of the Peter-Weyl theorem, applied to the Banach space of frame functions equipped with the supremum-norm (see Notes). □

Something like this is necessary, because one needs to rule out the possibility that although (by the Stone-Weierstrass Theorem) the polynomial functions on $\mathbb{R}^3$, restricted to $S^2$, are uniformly dense in $C(S^2)$, so that the linear span of all spherical harmonics and hence of all $H_\ell$ is uniformly dense in $C(S^2)$, some frame functions might lie in the closure of this direct sum (or, in other words, they are given by uniformly convergent infinite sums of certain $Y_\ell^m$). Lemma 2.40 clinches the proof of (2.162), since the third part implies that $\mathcal{F}(S^2)$ would contain all irreducible representations that contribute to the potential infinite sums; but we have already proved that it only contains $H_0$ and $H_2$. Thus Lemma 2.35 now also follows. □
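
The selection rule used in the proof of Lemma 2.35, namely that $e^{im\phi}(1 + i^m)$ is independent of $\phi$ iff $m = 0$ or $m = 2$ (mod 4), can be checked mechanically. In the sketch below the function names and the sample angles (multiples of 0.37, chosen only to avoid accidental periodicity) are assumptions made for this illustration.

```python
import cmath
import math


def equator_sum(m, phi):
    """e^{i m phi} + e^{i m (phi + pi/2)}, i.e. Y_l^m(u_1) + Y_l^m(u_2)
    up to a phi-independent constant, for even m."""
    return cmath.exp(1j * m * phi) + cmath.exp(1j * m * (phi + math.pi / 2.0))


def is_phi_independent(m, samples=12, tol=1e-9):
    """Numerically test whether equator_sum(m, phi) is constant in phi."""
    values = [equator_sum(m, 0.37 * k) for k in range(samples)]
    return all(abs(v - values[0]) < tol for v in values)


def allowed_even_m(ell):
    """Even m with |m| <= ell (ell even) that pass the frame-function test."""
    return [m for m in range(-ell, ell + 1, 2) if is_phi_independent(m)]
```

For $\ell = 2$ every even $m$ passes, whereas for $\ell = 4$ the values $m = \pm 4$ fail, which is why $H_4$ cannot be contained in the space of frame functions.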
2.9 Effects and Busch’s Theorem

Gleason’s Theorem is easy to state but difficult to prove; Busch’s Theorem is a variation of it, which is more difficult to state but much easier to prove. Logically, Busch’s Theorem is weaker than Gleason’s, as the assumptions of the latter are contained in those of the former, but physically it appears to be more useful, as it covers more situations. To wit, Busch’s Theorem revolves around certain generalizations of projections (which took centre stage in Gleason’s Theorem) called effects: these are (necessarily self-adjoint) operators $a \in B(H)$ that satisfy $0 \leq a \leq 1_H$, in the sense defined after Proposition A.22. Thus $a \in B(H)$ is an effect iff

$0 \leq \langle \psi, a\psi\rangle \leq 1$ $(\psi \in H_1)$. (2.163)

The set of effects on a Hilbert space $H$ is denoted by $\mathcal{E}(H)$ or by $[0,1]_{B(H)}$. By Theorem A.10, we have (2.163) iff $a^* = a$ and the eigenvalues $\lambda$ of $a$ lie in the interval $[0,1]$ (i.e., $\sigma(a) \subset [0,1]$). This implies that $\|a\| \leq 1$. Conversely, using the bound $a \leq \|a\| \cdot 1_H$ for any self-adjoint operator $a$, which easily follows from (A.47), we see that for $a \geq 0$, the condition $\|a\| \leq 1$ is equivalent to $a \in \mathcal{E}(H)$. In particular, it follows that both projections and density operators are effects.
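
The spectral criterion for effects is simple to test in the smallest nontrivial case. The sketch below treats real symmetric 2×2 matrices $[[a, b], [b, d]]$ only, whose eigenvalues are available in closed form; the function names eigenvalues_2x2 and is_effect and the tolerance are assumptions made for this illustration.

```python
import math


def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], in increasing order."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius


def is_effect(a, b, d, tol=1e-12):
    """True iff the spectrum of [[a, b], [b, d]] lies in [0, 1], cf. (2.163)."""
    low, high = eigenvalues_2x2(a, b, d)
    return low >= -tol and high <= 1.0 + tol
```

For example, the projection onto the diagonal, is_effect(0.5, 0.5, 0.5), and the density matrix is_effect(0.3, 0.0, 0.7) both qualify, whereas twice a projection, is_effect(1.0, 1.0, 1.0), does not.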

Proposition 2.41. 1. The set $\mathcal{E}(H)$ of effects on $H$ is a compact convex subset of $B(H)$ in its $\sigma$-weak topology, with extreme boundary

$\partial_e \mathcal{E}(H) = \mathcal{P}(H)$, (2.164)

i.e., the set of all projections on $H$ (including 0).

2. Each $a \in \mathcal{E}(H)$ has a (typically non-unique) extremal decomposition

$a = \sum_{i=0}^{m} t_i f_i$, (2.165)

in which $t_i \geq 0$ and $\sum_i t_i = 1$, and the $f_i$ are projections.

The $\sigma$-weak topology on $B(H)$, defined after Corollary A.31, is the right one in this context, but if $H$ is finite-dimensional, as we assume here, this technicality may be ignored, as the claim is even true with respect to the norm topology.

Proof. In Part 1, compactness and convexity are easily checked.

The inclusion $\partial_e \mathcal{E}(H) \subset \mathcal{P}(H)$ is equivalent to the claim that any $a \in \mathcal{E}(H)$, $a \notin \mathcal{P}(H)$, does not lie in $\partial_e \mathcal{E}(H)$ and hence admits a convex decomposition

$a = t a_1 + (1 - t) a_2$, $t \in (0,1)$, $a_1, a_2 \in \mathcal{E}(H)$, $a_1 \neq a \neq a_2$, (2.166)

or, equivalently, $a$ has a nontrivial decomposition $a = \sum_i t_i a_i$ for certain $t_i > 0$ with $\sum_i t_i = 1$. Indeed, the latter follows from the spectral resolution (A.37), in which the spectral projections $e_\lambda$ should be rescaled if necessary so as to make the coefficients sum to unity (note that $te \in \mathcal{E}(H)$ for any projection $e$ and any $t \in [0,1]$).




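To show the opposite inclusion $\mathcal{P}(H) \subset \partial_e \mathcal{E}(H)$, again assume (2.166), where this time $a = e \in \mathcal{P}(H)$ is a projection. “Sandwiching” between $\psi \in H_1$, this yields

$\langle \psi, a_1\psi\rangle = \langle \psi, a_2\psi\rangle = 0$, $\psi \in (eH)^\perp$; (2.167)

$\langle \psi, a_1\psi\rangle = \langle \psi, a_2\psi\rangle = 1$, $\psi \in eH$. (2.168)

Using $0 \leq a_i \leq 1$, $i = 1, 2$, and (A.37), these equations imply that $a_1 = a_2 = e$.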

The claim of part 2 is satisfied by picking the $f_i$ and $t_i$ in terms of the spectral data associated to $a$ (cf. Theorem A.10), as follows: with $m = |\sigma(a)|$, order the eigenvalues $\lambda \in \sigma(a)$ according to $\lambda_1 < \cdots < \lambda_m$, and take:

$t_0 = 1 - \lambda_m$; (2.169)

$t_1 = \lambda_1$; (2.170)

$t_i = \lambda_i - \lambda_{i-1}$ $(i \geq 2)$; (2.171)

$f_0 = 0$; (2.172)

$f_1 = 1_H$; (2.173)

$f_i = \sum_{j=i}^{m} e_{\lambda_j}$ $(i \geq 2)$. (2.174)

The validity of (2.165) is then a trivial verification. □
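
The “trivial verification” can be spelled out for an effect that is diagonal in some basis (which can always be arranged by the spectral theorem). In this sketch a diagonal operator is represented by the list of its diagonal entries and a projection by a 0/1 list; these representations and the function names are assumptions made for the example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def extremal_decomposition(diagonal):
    """Extremal decomposition of the effect diag(diagonal) following (2.169)-(2.174).

    Returns a list of pairs (t_i, f_i), where each projection f_i is
    represented by its 0/1 diagonal.
    """
    eigenvalues = sorted(set(diagonal))            # lambda_1 < ... < lambda_m
    dim = len(diagonal)
    terms = [(1 - eigenvalues[-1], [0] * dim)]     # t_0 = 1 - lambda_m, f_0 = 0
    previous = Fraction(0)
    for lam in eigenvalues:
        # t_1 = lambda_1 and t_i = lambda_i - lambda_{i-1} for i >= 2; f_i projects
        # onto the eigenspaces with eigenvalue >= lambda_i (so that f_1 = 1_H).
        projection = [1 if entry >= lam else 0 for entry in diagonal]
        terms.append((lam - previous, projection))
        previous = lam
    return terms


def recombine(terms):
    """Diagonal of sum_i t_i f_i."""
    dim = len(terms[0][1])
    return [sum(t * f[k] for t, f in terms) for k in range(dim)]
```

For the effect with diagonal (1/5, 1/2, 1/2, 9/10), the weights come out as 1/10, 1/5, 3/10 and 2/5, they sum to one, and recombine returns the original diagonal.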


Note that, in general, the extremal decomposition of $a$ as an effect differs from its spectral resolutions (A.37) or (A.38) as a self-adjoint operator. If $a = \rho$ is a density operator, then the latter, i.e., (2.6), does provide an extremal decomposition of $a$ construed as an effect also, which differs from the one in (2.165). This example shows that extremal decompositions in $\mathcal{E}(H)$ are not necessarily unique. Also, observe that $te$, for $e \in \mathcal{P}(H)$ and $t \in (0,1)$, does not lie in $\partial_e \mathcal{E}(H)$, since it admits a nontrivial decomposition $te = t \cdot e + (1 - t) \cdot 0$, recalling that $0 \in \mathcal{P}(H) \subset \mathcal{E}(H)$.

Busch’s Theorem classifies the following objects.

Definition 2.42. A probability distribution on $\mathcal{E}(H)$ is a function $p: \mathcal{E}(H) \to [0,1]$ that satisfies the following two conditions:

1. $p(1_H) = 1$;

2. If a (finite) family $(a_i)$ of effects satisfies $\sum_i a_i \leq 1_H$, then

$p\left(\sum_i a_i\right) = \sum_i p(a_i)$. (2.175)

Lemma 2.43. If a (finite) family $(a_i)$ of effects satisfies $\sum_i a_i = 1_H$, then $\sum_i p(a_i) = 1$.

This trivial observation implies that a probability distribution on $\mathcal{E}(H)$ induces a probability distribution on $\mathcal{P}(H) \subset \mathcal{E}(H)$ by restriction, cf. Definition 2.23. Another way to see this from the perspective of probability measures is to note that any family $(e_i)$ of projections that satisfies $\sum_i e_i \leq 1$ is automatically orthogonal. Therefore, restricted to $\mathcal{P}(H)$, Definition 2.42 reduces to Definition 2.23.2. To see this, fix $j$ and pick $\psi \in e_j H$. The condition $\sum_i e_i \leq 1$ gives

$\sum_{i \neq j} \langle \psi, e_i \psi \rangle = \sum_{i \neq j} \|e_i \psi\|^2 \leq 0$,

but since each term is positive, this implies $e_i \psi = 0$ for each $i \neq j$. Putting $\psi = e_j \varphi$, where $\varphi \in H$ is arbitrary, this gives $e_i e_j \varphi = 0$ for all $\varphi$ and hence $e_i e_j = 0$.

Clearly, any state $\omega$ on $B(H)$ induces a probability distribution $p_\omega$ on $\mathcal{E}(H)$ by

$p_\omega(a) = \omega(a)$. (2.176)

Busch’s Theorem shows the converse.

Theorem 2.44. Any probability distribution $p$ on $\mathcal{E}(H)$ takes the form $p = p_\omega$ for some state $\omega$ on $B(H)$, establishing a bijective correspondence between probability distributions on $\mathcal{E}(H)$ and states on $B(H)$.

Proof. If $p: \mathcal{E}(H) \to [0,1]$ can be extended to a linear map $\omega: B(H) \to \mathbb{C}$, then $\omega$ is automatically a state, for normalization is assumed and positivity follows from the fact that any $0 \neq b \geq 0$ has the form $b = ra$ for some $r \in \mathbb{R}^+$ and $0 \leq a \leq 1_H$, namely with $r = \|b\|$ and $a = b/\|b\|$; then $a \geq 0$ and $\|a\| = 1$, so that, as explained earlier, $a$ is an effect. Hence $\omega(b) = \omega(ra) = r p(a) \geq 0$. To achieve this extension:

1. We show that $p(ra) = r p(a)$ for all $r \in \mathbb{Q} \cap [0,1]$ and $0 \leq a \leq 1_H$. Indeed, for any such $a$ and $n \in \mathbb{N}$ we write $a = (a + \cdots + a)/n$ ($n$ terms), so that by (2.175), $p(a) = n p(a/n)$. Similarly, for any $m \in \mathbb{N}$ and $0 \leq b \leq 1_H/m$, we have $p(mb) = m p(b)$. Take integers $m, n$ such that $(m/n) \in [0,1]$ and put $b = a/n$, so that

$p\left(\frac{m}{n} a\right) = m p\left(\frac{a}{n}\right) = \frac{m}{n} p(a)$. (2.177)

2. We next prove that $p(ta) = t p(a)$ for all $t \in [0,1]$ and $0 \leq a \leq 1_H$. Positivity of $p$ yields $p(a) \leq p(a')$ whenever $0 \leq a \leq a' \leq 1_H$. Given $t \in [0,1]$, take an increasing sequence of rationals $(r_n)$ with $r_n \leq t$, as well as a decreasing sequence of rationals $(s_n)$ with $t \leq s_n$, such that $r_n \uparrow t$ and $s_n \downarrow t$ in $\mathbb{R}$. With step 1, this gives

$r_n p(a) = p(r_n a) \leq p(ta) \leq p(s_n a) \leq s_n p(a)$.

Letting $n \to \infty$, this gives $t p(a) \leq p(ta) \leq t p(a)$, and hence equality.

3. Now extend $p$ to all $a \geq 0$, calling the extension $\omega$, by $\omega(a) = \|a\| \, p(a/\|a\|)$ at $a \neq 0$ and $\omega(0) = 0$; the previous step then easily yields the compatibility property $\omega|_{\mathcal{E}(H)} = p$ and the scaling property $\omega(ta) = t \omega(a)$ for each $t \geq 0$.

4. For $a \geq 0$ and $b \geq 0$, rescaling and (2.175) yield $\omega(a + b) = \omega(a) + \omega(b)$.

5. For general $a^* = a$ we write $a = a_+ - a_-$, with $a_\pm \geq 0$, as in Proposition A.24, and define $\omega$ on all of $B(H)_{sa}$ by $\omega(a) = \omega(a_+) - \omega(a_-)$. This is well defined despite the lack of uniqueness of (A.74), for if $a = a_+ - a_- = a'_+ - a'_-$, with $a'_\pm \geq 0$, then $a_+ + a'_- = a'_+ + a_-$, whence $\omega(a_+) - \omega(a_-) = \omega(a'_+) - \omega(a'_-)$. This argument also shows that $\omega$ remains linear on general self-adjoint $a$ and $b$, since $a + b = (a_+ + b_+) - (a_- + b_-)$ is a decomposition with $(a_\pm + b_\pm) \geq 0$.

6. Finally, for general $c \in B(H)$ we (uniquely) decompose $c = a + ib$, $a^* = a$, $b^* = b$, cf. the proof of Corollary A.20, and put $\omega(c) = \omega(a) + i \omega(b)$. □

To close, we give a very brief and superficial introduction to effects as they arise from modern (“operational”) quantum measurement theory. This theory associates quantum data to classical data through the concept of a Positive Operator Valued Measure or POVM. Relative to some given “classical” space $X$ (taken finite here) and Hilbert space $H$ (assumed finite-dimensional), a POVM is defined as a map

$A: \mathcal{P}(X) \to \mathcal{E}(H)$ (2.178)

on the power set $\mathcal{P}(X)$ of $X$ that satisfies $A(X) = 1_H$ as well as $A(U \cup V) = A(U) + A(V)$ whenever $U \cap V = \emptyset$, cf. Definition 1.1. Equivalently, a POVM is a map

$a: X \to \mathcal{E}(H)$ (2.179)

that satisfies

$\sum_{x \in X} a(x) = 1_H$. (2.180)

As in the classical case, these notions are trivially equivalent through

$a(x) = A(\{x\})$; (2.181)

$A(U) = \sum_{x \in U} a(x)$. (2.182)

The motivating special case of a POVM is given by some self-adjoint operator $a \in B(H)$, which yields $X = \sigma(a)$ and $a(\lambda) = e_\lambda$. In that case, each density operator $\rho$ induces a probability distribution on $\sigma(a)$ through the Born rule (2.8). More generally, a probability distribution $p$ on $\mathcal{E}(H)$ and a POVM (2.179) jointly determine a probability distribution $p_a$ on $X$, given by

$p_a(x) = p(a(x))$. (2.183)

Indeed, $p_a(x) \geq 0$ because $a \geq 0$, and $\sum_{x \in X} p_a(x) = 1$ by (2.180) and Lemma 2.43. The idea, then, is that a measurement of some POVM $a$ has (classical) outcome $x$ with probability $p_a(x)$; this generalizes the traditional dogma that a measurement of an observable $a$ has outcome $\lambda \in \sigma(a)$ with (Born) probability (2.8). Indeed, combined with (2.33), Busch’s Theorem shows that we necessarily have

$p_a(x) = \mathrm{Tr}(\rho \, a(x))$, (2.184)

for some density operator $\rho$. So nothing has been gained by introducing Definition 2.42, except perhaps for the insight that, as in Gleason’s Theorem, it is the non-contextuality of a probability distribution on $\mathcal{E}(H)$ (in that $p(a(x))$ is independent of the POVM $a$ which $a(x)$ forms part of) that eventually enforces (2.184).
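A concrete POVM whose effects are not projections may help here. The following sketch is an illustration only: the three-outcome POVM on a two-dimensional real Hilbert space and the density operator `rho` are chosen for the example and do not come from the text. It checks the normalization (2.180) and evaluates the probabilities (2.184).

```python
import math

def outer(v):
    """Rank-one matrix |v><v| for a real 2-vector v."""
    return [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]

def scale(c, m):
    """Multiply a 2x2 matrix by a scalar."""
    return [[c * x for x in row] for row in m]

def add(m, n):
    """Sum of two 2x2 matrices."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

def trace_product(m, n):
    """Tr(mn) for 2x2 matrices."""
    return sum(m[i][j] * n[j][i] for i in range(2) for j in range(2))

# Three effects a(x) = (2/3)|v_x><v_x| with v_x at angles 0, 60 and 120 degrees.
# Each has eigenvalues 0 and 2/3, so none of them is a projection.
povm = [scale(2 / 3, outer((math.cos(k * math.pi / 3), math.sin(k * math.pi / 3))))
        for k in range(3)]

# Normalization (2.180): the effects sum to the unit operator.
total = [[0.0, 0.0], [0.0, 0.0]]
for effect in povm:
    total = add(total, effect)

# A density operator chosen for illustration, and the distribution (2.184).
rho = [[0.75, 0.0], [0.0, 0.25]]
probabilities = [trace_product(rho, effect) for effect in povm]
```

The three outcomes occur with probabilities 1/2, 1/4 and 1/4 (up to rounding), which sum to one as Lemma 2.43 requires.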
2.10 The quantum logic of Birkhoff and von Neumann

In §1.4 we showed that classical mechanics has a classical logical structure, in which (equivalence classes of) propositions correspond to subsets of phase space. These subsets form a Boolean lattice in which the logical connectives $\neg$, $\wedge$, and $\vee$ for negation, conjunction, and disjunction, respectively, are interpreted as their natural set-theoretic counterparts (i.e., complementation, intersection, and union).

In 1936, Birkhoff and von Neumann proposed a strikingly similar quantum logic for quantum mechanics, in which (closed) linear subspaces of Hilbert space play the role of (measurable) subsets of phase space, and the basic logical connectives (except implication, which is queerly lacking in this setting) are interpreted as:

$\neg L = L^\perp$; (2.185)

$L \wedge M = L \cap M$; (2.186)

$L \vee M = L + M$, (2.187)

where $L^\perp$ is the orthogonal complement of $L$, see (A.29), $L \cap M$ is the (set-theoretic) intersection of $L$ and $M$, and $L + M$ is the (closed) linear span of $L$ and $M$. If $\dim(H) < \infty$, as we continue to assume, any linear subspace of $H$ is automatically closed; in the infinite-dimensional case, an attractive operator-algebraic and lattice-theoretic structure arises only if the events are taken to be closed linear subspaces.

Although the Brouwer-Hilbert debate on the foundations of mathematics had somewhat subsided in 1936, with hindsight it may be argued that the quantum logic of Birkhoff and von Neumann (who had been a “postdoc” avant la lettre with Hilbert) was predicated on their desire to preserve not only the law of contradiction

$\alpha \wedge \neg\alpha = \bot$, (2.188)

where $\alpha$ is any proposition and $\bot$ is the proposition that is identically false, but also, against Brouwer, the law of excluded middle (or tertium non datur)

$\alpha \vee \neg\alpha = \top$, (2.189)

where $\top$ is the proposition that is identically true. Indeed, in the Birkhoff-von Neumann model (2.185) - (2.187), where $\bot = \{0\}$ and $\top = H$, these are identities. Similarly, their model satisfies the law of double negation

$\neg\neg\alpha = \alpha$, (2.190)

which both in classical logic (where it is a tautology) and in intuitionistic logic (where it is rejected in general) is equivalent to (2.189). Also, De Morgan’s Laws:

$\neg(\alpha \vee \beta) = \neg\alpha \wedge \neg\beta$; (2.191)

$\neg(\alpha \wedge \beta) = \neg\alpha \vee \neg\beta$, (2.192)

hold in their quantum logic (despite their origin in classical propositional logic).
We will now derive the Birkhoff-von Neumann structure along similar lines as its classical counterpart (cf. §1.4), except that in the absence of the necessary structure for a classical propositional calculus we now rely on semantic entailment alone.

In quantum theory, the role of functions $f: X \to \mathbb{R}$ as observables in classical physics is played by self-adjoint operators $a: H \to H$ on some Hilbert space $H$, and hence the quantum analogue of an elementary proposition $f \in \Delta$ of classical physics is $a \in \Delta$ (where $\Delta \subset \mathbb{R}$), with special case $a = \lambda$ for $a \in \{\lambda\}$ (with $\lambda \in \mathbb{R}$).

In analogy to the points $x \in X$ of phase space, pure states $\omega_\psi$ as in (2.42), or the corresponding density operators $e_\psi$ (where $\psi \in H$ is a unit vector), yield truth assignments to elementary propositions. To start with the simplest case, $a = \lambda$ is:

• true with respect to $\omega_\psi$ iff $p_\psi(\lambda) = 1$, see (2.10), or, equivalently, iff $\psi \in H_\lambda$, where $H_\lambda \subset H$ is the eigenspace of $a$ for eigenvalue $\lambda$, cf. (A.36);

• false with respect to $\omega_\psi$ iff $p_\psi(\lambda) = 0$, or, equivalently, iff $\psi \perp H_\lambda$.

The underlying idea here is arguably that, according to some naive operational interpretation of quantum mechanics, a measurement of $a$ in a state $\omega_\psi$ would give outcome $\lambda$ with probability one (zero) iff $a = \lambda$ is true (false) with respect to $\omega_\psi$. If $0 < p_\psi(\lambda) < 1$, the “truthmaker” $\omega_\psi$ actually fails to assign a truth value to $a = \lambda$; the partial nature of truthmakers marks a significant difference with the classical case, as does the closely related distinction between false and not true. Similarly, we say that an elementary proposition $a \in \Delta$ is true in some state $\omega_\psi$ iff

$P_\psi(\Delta) = \|e_\Delta \psi\|^2 = 1$, (2.193)

cf. (2.9) and (A.42), and false if $P_\psi(\Delta) = 0$. In other words, $a \in \Delta$ is true in $\omega_\psi$ iff $\psi \in H_\Delta$, and false if $\psi \perp H_\Delta$, see (A.43). Such propositions may formally be combined using the connectives $\neg$, $\wedge$, and $\vee$ (whose meaning is unfortunately far from clear in this new setting) according to the same (inductive) formation rules as in classical propositional logic. However, the classical truth tables for $\wedge$ and $\vee$ are unsound with regard to the above rules, at least if one eventually wants to arrive at (2.185) - (2.187). For example, $\omega_\psi$ may validate neither $\alpha$ nor $\beta$, yet it might make $\alpha \vee \beta$ true (assuming that $\alpha$ and $\beta$ correspond to $L$ and $M$, respectively, this is the case if $\psi \notin L$ and $\psi \notin M$, yet $\psi \in L + M$). Similarly, $\omega_\psi$ may render neither $\alpha$ nor $\beta$ false, yet it may falsify $\alpha \wedge \beta$. Due to this complication, the approach of §1.4 has to be modified, as follows. Our goal remains to define a semantic equivalence relation $\sim_H$, which is predicated on an inductive definition of truth we first give.

Definition 2.45. 1. $a \in \Delta$ is true in $\omega_\psi$ iff $P_\psi(\Delta) = 1$, and false if $P_\psi(\Delta) = 0$.

2. The negation $\neg(a \in \Delta)$ of an elementary proposition $a \in \Delta$ is given by $a \in \Delta^c$.

3. The negation $\neg\alpha$ is true iff $\alpha$ is false.

4. The conjunction $\alpha \wedge \beta$ is true iff both $\alpha$ and $\beta$ are true.

5. De Morgan’s Laws (2.191) - (2.192) and the law of double negation (2.190) hold; in particular, the disjunction $\alpha \vee \beta$ is true iff $\neg(\neg\alpha \wedge \neg\beta)$ is true (as per 1-4).

6. We write $\alpha \models_H \beta$ iff the truth of $\alpha$ implies the truth of $\beta$ for each state $\omega_\psi$.

7. We write $\alpha \sim_H \beta$ iff $\alpha \models_H \beta$ and $\beta \models_H \alpha$.

8. If $\alpha \sim_H \beta$, then $\neg\alpha \sim_H \neg\beta$.
Lemma 2.46. Definition 2.45 implies the following rules:

1. Our earlier truth attributions for the case $a \in \Delta$ with $\Delta = \{\lambda\}$. In particular, $a = \lambda$ is always false when $\lambda \notin \sigma(a)$, and so is $a \in \Delta$ whenever $\Delta \cap \sigma(a) = \emptyset$.

2. $a \in \Delta$ is false relative to $\omega_\psi$ iff $\psi \perp H_\Delta$.

3. $(a \in \Delta) \wedge (b \in \Gamma)$ is true in $\omega_\psi$ iff $\psi \in H^{(a)}_\Delta \cap H^{(b)}_\Gamma$;

4. $(a \in \Delta) \vee (b \in \Gamma)$ is true in $\omega_\psi$ iff $\psi \in H^{(a)}_\Delta + H^{(b)}_\Gamma$.

Hence conjunctions behave classically, as part 3 states that $(a \in \Delta) \wedge (b \in \Gamma)$ is true iff $a \in \Delta$ and $b \in \Gamma$ are true. The proof of this lemma uses the following notation.

Definition 2.47. If $e$ and $f$ are projections on a Hilbert space $H$, then:

• $e \wedge f$ is the projection onto $eH \cap fH$;

• $e \vee f$ is the projection onto $eH + fH$, i.e., the (closed) linear span of $eH$ and $fH$.

Note that if $e$ and $f$ commute, these reduce to the algebraic expressions

$e \wedge f = ef$; (2.194)

$e \vee f = e + f - ef$. (2.195)

Furthermore, in case of potential ambiguity we will write $e^{(a)}_\Delta$ for the spectral projection $e_\Delta$ as defined by $a$, and analogously $e^{(b)}_\Gamma$ etc. Similarly for $H^{(a)}_\Delta$ etc.

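Proof. The first and third claims are immediate. The second one follows from the relation $e_{\Delta^c} = 1 - e_\Delta$ or, equivalently, $H_{\Delta^c} = H_\Delta^\perp$. For the fourth, use Definition 2.45.6, 3, and 2 to infer that $(a \in \Delta) \vee (b \in \Gamma)$ is true iff $(a \in \Delta^c) \wedge (b \in \Gamma^c)$ is false. From the third claim, we note that

$(a \in \Delta) \wedge (b \in \Gamma) \sim_H \left(e^{(a)}_\Delta \wedge e^{(b)}_\Gamma = 1\right)$, (2.196)

so by Definition 2.45.5, $(a \in \Delta^c) \wedge (b \in \Gamma^c)$ is false iff $e^{(a)}_{\Delta^c} \wedge e^{(b)}_{\Gamma^c} = 1$ is false. Since $e^{(a)}_{\Delta^c} \wedge e^{(b)}_{\Gamma^c} = 1$ is true iff $\psi \in H^{(a)}_{\Delta^c} \cap H^{(b)}_{\Gamma^c}$, claim 2 implies that $e^{(a)}_{\Delta^c} \wedge e^{(b)}_{\Gamma^c} = 1$ is false iff

$\psi \in \left(H^{(a)}_{\Delta^c} \cap H^{(b)}_{\Gamma^c}\right)^\perp = \left((H^{(a)}_\Delta)^\perp \cap (H^{(b)}_\Gamma)^\perp\right)^\perp = (H^{(a)}_\Delta)^{\perp\perp} + (H^{(b)}_\Gamma)^{\perp\perp} = H^{(a)}_\Delta + H^{(b)}_\Gamma$,

which finishes the proof. □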
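Parts 3 and 4 of Lemma 2.46 can be checked on the smallest possible example. The sketch below is an illustration under assumptions of our own: a two-dimensional real Hilbert space, the coordinate axes as the subspaces $L$ and $M$ corresponding to two propositions $\alpha$ and $\beta$, and a unit vector `psi` picked so that all probabilities are exact fractions. The helper names are hypothetical.

```python
from fractions import Fraction

def probability(psi, basis):
    """||e psi||^2 for the projection e onto the span of an orthonormal basis."""
    return sum(sum(b_k * p_k for b_k, p_k in zip(b, psi)) ** 2 for b in basis)

def truth_value(p):
    """True if the probability is 1, False if it is 0, None if undetermined."""
    if p == 1:
        return True
    if p == 0:
        return False
    return None

psi = (Fraction(3, 5), Fraction(4, 5))   # unit vector in R^2
L = [(1, 0)]                             # alpha corresponds to L
M = [(0, 1)]                             # beta corresponds to M
L_plus_M = [(1, 0), (0, 1)]              # linear span L + M, here all of H
L_cap_M = []                             # intersection of L and M, here {0}

alpha = truth_value(probability(psi, L))
beta = truth_value(probability(psi, M))
alpha_or_beta = truth_value(probability(psi, L_plus_M))
alpha_and_beta = truth_value(probability(psi, L_cap_M))
```

In this state neither `alpha` nor `beta` receives a truth value (both are `None`), yet the disjunction is true and the conjunction is false, which is exactly the non-classical behaviour described before Definition 2.45.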

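Quite analogously to the classical case, Definition 2.45 implies

$(a \in \Delta) \models_H (b \in \Gamma)$ iff $e^{(a)}_\Delta H \subseteq e^{(b)}_\Gamma H$, (2.197)

which, once again, immediately yields $(a \in \Delta) \sim_H (b \in \Gamma)$ iff $e^{(a)}_\Delta = e^{(b)}_\Gamma$. Taking $b = e^{(a)}_\Delta$ and $\Gamma = \{1\}$, analogously to (1.53), as in the above proof we have

$a \in \Delta \sim_H e^{(a)}_\Delta = 1$. (2.198)

Furthermore, as in the proof of Lemma 2.46 we find

$(a \in \Delta) \wedge (b \in \Gamma) \sim_H \left(e^{(a)}_\Delta \wedge e^{(b)}_\Gamma = 1\right)$; (2.199)

$(a \in \Delta) \vee (b \in \Gamma) \sim_H \left(e^{(a)}_\Delta \vee e^{(b)}_\Gamma = 1\right)$. (2.200)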

Consequently, we have the following counterpart of Lemma 1.19:

Lemma 2.48. Any elementary or composite proposition is semantically equivalent (relative to $H$) to one of the form $e = 1$, for some projection $e$. Furthermore, writing $e^\perp = 1 - e$,

$\neg(e = 1) \sim_H (e^\perp = 1)$; (2.201)

$(e = 1) \wedge (f = 1) \sim_H (e \wedge f = 1)$; (2.202)

$(e = 1) \vee (f = 1) \sim_H (e \vee f = 1)$. (2.203)

At last, the quantum version of Theorem 1.20 reads as follows:

Theorem 2.49. The set $\mathscr{B}(H)$ of equivalence classes $[\cdot]_H$ of propositions generated by the elementary propositions $a \in \Delta$ and the logical connectives $\neg$, $\vee$, and $\wedge$, is isomorphic to the set $\mathcal{L}(H)$ of linear subspaces of $H$, under the map

$\varphi: \mathscr{B}(H) \to \mathcal{L}(H)$; (2.204)

$\varphi([a \in \Delta]_H) = e^{(a)}_\Delta H$. (2.205)

Under this isomorphism, the logical connectives $\neg$, $\wedge$ and $\vee$ turn into orthogonal complementation $(-)^\perp$, intersection $\cap$, and linear span $+$, respectively, in that

$\varphi([\neg\alpha]_H) = \varphi([\alpha]_H)^\perp$; (2.206)

$\varphi([\alpha \wedge \beta]_H) = \varphi([\alpha]_H) \cap \varphi([\beta]_H)$; (2.207)

$\varphi([\alpha \vee \beta]_H) = \varphi([\alpha]_H) + \varphi([\beta]_H)$. (2.208)

Furthermore, if we define a partial order $\leq$ on $\mathscr{B}(H)$ by saying that $[\alpha]_H \leq [\beta]_H$ iff $\alpha \models_H \beta$ (which is well defined), then $\varphi$ maps $\leq$ into set-theoretic inclusion $\subseteq$, i.e.,

$[\alpha]_H \leq [\beta]_H$ iff $\varphi([\alpha]_H) \subseteq \varphi([\beta]_H)$. (2.209)

With respect to these operations, $\mathcal{L}(H)$ is a modular lattice (granted that $\dim(H) < \infty$; otherwise, the lattice is merely orthomodular, cf. §D.1 for terminology).

Proof. Most of this is immediate from Lemma 2.48, except for the last claim, which follows from simple computations (and from the Amemiya-Araki Theorem). □

As in the classical case, there is an algebraic reformulation of this result, obtained from the bijective correspondence between (closed) linear subspaces $L$ of $H$ and projections $e$ on $H$, given by $L = eH$ (see Proposition A.8).
Theorem 2.50. The set $\mathscr{B}(H)$ of equivalence classes $[\cdot]_H$ of propositions generated by the elementary propositions $a \in \Delta$ and the logical connectives $\neg$, $\vee$, and $\wedge$, is isomorphic to the set $\mathcal{P}(H)$ of projections on $H$, under the map

$\varphi': \mathscr{B}(H) \to \mathcal{P}(H)$; (2.210)

$\varphi'([a \in \Delta]_H) = e^{(a)}_\Delta$, (2.211)

where (once again) $\mathcal{P}(H)$ is the set of all projections on $H$.

Under this map, the logical connectives $\neg$, $\wedge$ and $\vee$ turn into (cf. Definition 2.47):

$\varphi'([\neg\alpha]_H) = 1 - \varphi'([\alpha]_H)$; (2.212)

$\varphi'([\alpha \wedge \beta]_H) = \varphi'([\alpha]_H) \wedge \varphi'([\beta]_H)$; (2.213)

$\varphi'([\alpha \vee \beta]_H) = \varphi'([\alpha]_H) \vee \varphi'([\beta]_H)$. (2.214)

Furthermore, $\varphi'$ maps the partial order $\leq$ on $\mathscr{B}(H)$ into the partial order on $\mathcal{P}(H)$ defined by $e \leq f$ iff $eH \subseteq fH$, or equivalently, iff $ef = e$.

Finally, with respect to these operations, $\mathcal{P}(H)$ is an (ortho)modular lattice.

However, unlike (1.65) - (1.68), this result is somewhat unsatisfactory in not being purely algebraic. This may partly be remedied through expressions like

$e \wedge f = \lim_{n \to \infty} (e \circ f)^n$; (2.215)

$e \vee f = 1 - ((1 - e) \wedge (1 - f))$, (2.216)

where $e \circ f = \tfrac{1}{2}(ef + fe)$, and the (strong) limit in (2.215) should be taken on fixed vectors $\psi \in H$ (upon which it exists in the norm-topology of $H$). Even so, this specific limit still relies on the underlying Hilbert space, and in any case the expressions fail to be purely algebraic and look pretty artificial. Indeed, the same may be said about Definition 2.45, which, of course, has been fine-tuned with hindsight in order to obtain the “desired” answer in the form of Theorem 1.20, which in turn vindicates the mathematically sweet Birkhoff-von Neumann Ansatz (2.185) - (2.187).
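The limit (2.215) can be probed numerically in low dimension. The sketch below rests on an assumption that should be flagged: it takes $e \circ f$ to be the symmetrized product $(ef + fe)/2$, since without the factor 1/2 the powers would grow without bound already for $e = f$. The matrices, the angle and the exponent `n` are illustrative choices, and the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def meet(e, f, n=80):
    """Approximate the meet of e and f by the n-th power of (ef + fe)/2."""
    ef, fe = matmul(e, f), matmul(f, e)
    size = len(e)
    jordan = [[(ef[i][j] + fe[i][j]) / 2 for j in range(size)] for i in range(size)]
    power = jordan
    for _ in range(n - 1):
        power = matmul(power, jordan)
    return power

def join(e, f, n=80):
    """Join of e and f via (2.216): 1 - (meet of 1 - e and 1 - f)."""
    size = len(e)
    def complement(p):
        return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(size)]
                for i in range(size)]
    return complement(meet(complement(e), complement(f), n))

c, s = math.cos(math.pi / 4), math.sin(math.pi / 4)
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]      # onto span{(1,0,0), (0,1,0)}
f = [[1.0, 0.0, 0.0], [0.0, c * c, c * s], [0.0, c * s, s * s]]  # onto span{(1,0,0), (0,c,s)}

e_meet_f = meet(e, f)
e_join_f = join(e, f)
```

The two ranges intersect in the line spanned by (1, 0, 0) and together span all of $\mathbb{R}^3$, so `e_meet_f` should approximate the projection onto that line and `e_join_f` the unit matrix, even though `e` and `f` do not commute.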

In addition, there are serious conceptual objections to this kind of quantum logic: 

1. Conjunction $\wedge$ and disjunction $\vee$ do not distribute over each other, rendering their interpretation as “and” and “or” obscure.

2. There are propositions $\alpha$ and $\beta$ (namely those for which $\varphi'([\alpha]_H)$ and $\varphi'([\beta]_H)$ do not commute) for which the conjunction $\alpha \wedge \beta$ is physically undefined.

3. There are states in which $\alpha \vee \beta$ is true whilst neither $\alpha$ nor $\beta$ is true.

4. There are states in which $\alpha \wedge \beta$ is false whilst neither $\alpha$ nor $\beta$ is false.

5. In view of Schrödinger’s Cat, one would expect the law of excluded middle (2.189) to fail in quantum mechanics, yet it holds in quantum logic (and this is possible because neither $\vee$ nor $\neg$ has any familiar logical meaning in it).

6. Finally, nothing is said or done about propositions that are neither true nor false.

In Chapter 12, we will therefore replace the doomed quantum logic of Birkhoff and von Neumann by the intuitionistic logic of Brouwer and Heyting.
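The first objection, the failure of distributivity, already shows up for three lines in the plane. The following sketch is an illustration with made-up subspaces; it compares dimensions only, using $\dim(L \cap M) = \dim L + \dim M - \dim(L + M)$, and the function names are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the linear span of the given vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def dim_join(L, M):
    """dim(L + M) for subspaces given by lists of spanning vectors."""
    return rank(L + M)

def dim_meet(L, M):
    """dim of the intersection, computed as dim L + dim M - dim(L + M)."""
    return rank(L) + rank(M) - rank(L + M)

L = [(1, 0)]
M = [(0, 1)]
N = [(1, 1)]

# Left-hand side: L meet (M join N); M + N spans the join of M and N.
lhs_dim = dim_meet(L, M + N)
# Right-hand side: (L meet M) join (L meet N); both meets are computed separately.
meet_dims = (dim_meet(L, M), dim_meet(L, N))
```

Here `lhs_dim` is 1, because the join of `M` and `N` is the whole plane, whereas both entries of `meet_dims` are 0, so the right-hand side is the zero subspace: $L \wedge (M \vee N) \neq (L \wedge M) \vee (L \wedge N)$.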
Notes

All operator theory for this chapter may be found in Kadison & Ringrose (1983).

§2.1. Quantum probability theory and the Born rule

The Born rule was first stated by Born (1926b) in the context of scattering theory, following the earlier paper (Born, 1926a) in which Born omitted the absolute value squared signs (corrected in a footnote added in proof). The application to the position operator is due to Pauli (1927), who merely spent a footnote on it. The general formulation is due to von Neumann (1932, §III), following earlier contributions by Dirac (1926b) and Jordan (1927). Both Born and Heisenberg acknowledge the profound influence of Einstein on the probabilistic formulation of quantum mechanics. However, Born and Heisenberg as well as Bohr, Dirac, Jordan, Pauli and von Neumann differed with Einstein about the fundamental nature of the Born probabilities and hence on the issue of determinism. Indeed, whereas Born and the others just listed after him believed the outcome of any individual quantum measurement to be unpredictable in principle, Einstein felt this unpredictability was just caused by the incompleteness of quantum mechanics (as he saw it). See, for example, the invaluable correspondence between Einstein and Born (2005).

Mehra & Rechenberg (2000) provide a very detailed reconstruction of the historical origin of the Born rule within the context of quantum mechanics, whereas von Plato (1994) embeds a briefer historical treatment of it into the more general setting of the emergence of modern probability theory and probabilistic thinking. For the earlier history of probability see Hacking (1975, 1990). See also Landsman (2009).

§2.2. Quantum observables and states 

Proposition 2.10 is due to von Neumann; see also Chapter 6. 

§2.3. Pure states in quantum mechanics 

This kind of thinking goes back to von Neumann (1932) and Segal (1947ab). 

§2.4. The GNS-construction for matrices

Again, see §C.12 for the GNS-construction in general.

§2.5. The Born rule from Bohrification
See notes to §4.1. 

§2.6. The Kadison-Singer Problem 

The Kadison-Singer Problem was first discussed in Kadison & Singer (1959). 
See the Notes to §4.3 for more information. 

§2.7. Gleason’s Theorem 
§2.8. Proof of Gleason’s Theorem 

Gleason’s Theorem is due to Gleason (1957), whose proof we largely follow, 
with some simplifications due to Varadarajan (1985) and Hamhalter (2004). Lemma 
2.40.3 or some analogous result is lacking from these references; it may be found 
in Lyubich (1988), Chapter 4, §2, Theorem. It is often claimed that Gleason’s proof 
has been superseded by the more elementary one due to Cooke, Keane, & Moran 
(1985), which avoids all use of harmonic analysis. A similar proof, following up on 
Cooke et al but using constructive analysis only, was given by Richman & Bridges 
(1999). However, both because Gleason’s use of rotation invariance is very natural,
and also since the proof of Cooke et al has already been presented and simplified in
two monographs entirely devoted to Gleason’s Theorem, viz. Dvurecenskij (1993) 
and Hamhalter (2004), as well as in the highly efficient book by Kalmbach (1998), 
we prefer to return to the original source (and add some technical details). 

§2.9. Effects and Busch’s Theorem 

Busch’s Theorem is from Busch (2003), whose proof we follow almost verbatim. 
See also Caves et al (2004). For the use of POVM’s in quantum physics see, e.g., 
Busch, Grabowski, & Lahti (1998), Davies (1976), Holevo (1982), Kraus (1983), 
Landsman (1998a, 1999), de Muynck (2002), and Schroeck (1996). 

§2.10. The quantum logic of Birkhoff and von Neumann

Our discussion is based on Redei (1998), with some modifications though. The original source is Birkhoff & von Neumann (1936).
Chapter 3

Classical physics on a general phase space

Passing from finite phase spaces $X$ to infinite ones yields many fascinating new phenomena, some of which even seem genuinely “emergent” in not having any finite-dimensional shadow, approximate or otherwise. Nonetheless, practically all results in the previous chapter remain valid, typically after the inclusion of some technical condition(s) that restrict the almost unlimited freedom allowed by infinite sets.

One of these restrictions is that in classical physics we assume that our phase space $X$ is locally compact Hausdorff, where we recall that a space is:

• compact if every open cover has a finite subcover;

• locally compact if every point has a compact neighbourhood;

• Hausdorff (or $T_2$) if every pair of distinct points $x, y$ can be separated by open sets (i.e., there are disjoint open sets $U_x$, $U_y$ that contain $x$ and $y$, respectively).

This combination of topological properties turns out to be very convenient; it incorporates spaces like $\mathbb{R}^n$ (and more generally all non-pathological manifolds), or lattices like $\mathbb{Z}^n$ (the price is that we exclude systems with an infinite number of degrees of freedom, such as classical field theories). A locally compact Hausdorff space $X$ is regular in that each $x \in X$ and each closed set $F \subset X$ not containing $x$ can be separated by open sets (i.e., there are disjoint open sets $U \ni x$ and $V \supseteq F$).

From the perspective of C*-algebras, the main advantage of using this particular class of spaces is that they are naturally singled out by Gelfand’s Theorem:

Theorem 3.1. Every commutative C*-algebra $A$ is isomorphic to $C_0(X)$ for some locally compact Hausdorff space $X$, which is unique up to homeomorphism.

A proof may be found in Appendix C; here we just explain the notation and the main idea behind the proof (cf. Definition C.1, which we do not repeat).

First, $C_0(X)$ is the set of all continuous functions $f: X \to \mathbb{C}$ that vanish at infinity, i.e., for any $\varepsilon > 0$ the set $\{x \in X \mid |f(x)| \geq \varepsilon\}$ is compact, or, equivalently, for any $\varepsilon > 0$ there is a compact set $K \subset X$ such that $|f(x)| < \varepsilon$ for all $x \notin K$. For example, if $X = \mathbb{R}$, then $f(x) = \exp(-x^2)$ lies in $C_0(\mathbb{R})$. If $X$ is compact, then $C_0(X) = C(X)$.
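For the Gaussian example the compact set in the definition can be written down explicitly: $|f(x)| \geq \varepsilon$ holds exactly when $|x| \leq \sqrt{\ln(1/\varepsilon)}$ (for $\varepsilon \leq 1$). A minimal sketch, in which the function name and the sample value of `eps` are illustrative:

```python
import math

def compact_set_radius(eps):
    """Radius R such that exp(-x**2) < eps whenever |x| > R.

    For eps <= 1 the set {x : exp(-x**2) >= eps} is the interval [-R, R];
    for eps > 1 that set is empty and R = 0 is returned.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    if eps > 1:
        return 0.0
    return math.sqrt(math.log(1 / eps))

eps = 1e-3
R = compact_set_radius(eps)
inside = math.exp(-(R - 0.01) ** 2)    # a point of the interval [-R, R]
outside = math.exp(-(R + 0.01) ** 2)   # a point outside it
```

By contrast, a nonzero constant function on $\mathbb{R}$ is continuous and bounded but does not vanish at infinity, since for small $\varepsilon$ the corresponding set is all of $\mathbb{R}$, which is not compact.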

Second, $C_0(X)$ is a vector space under pointwise operations (including pointwise complex conjugation as the involution), and is a Banach space in the sup-norm

$\|f\|_\infty = \sup_{x \in X} \{|f(x)|\}$. (3.1)

The space $X$ making $A$ isomorphic to $C_0(X)$, then, is the Gelfand spectrum $\Sigma(A)$ of $A$, which we already encountered (cf. Definition 1.4) as the set of nonzero algebra homomorphisms from $A$ to $\mathbb{C}$. This set turns out to be a locally compact Hausdorff space in the topology of pointwise convergence, and the isomorphism $A \to C_0(X)$ is the Gelfand transform $a \mapsto \hat{a}$, where $\hat{a}(\omega) = \omega(a)$. Conversely, if $X$ is given, then we associate the commutative C*-algebra $C_0(X)$ to it, as in Chapter 1.

Generalizing Definition 1.14, as a special case of the notion of a state we have:

Definition 3.2. A state on $C_0(X)$ is a positive (and hence bounded) linear functional $\omega: C_0(X) \to \mathbb{C}$ with $\|\omega\| = 1$.

If $X$ is compact, given positivity one has $\|\omega\| = 1$ iff $\omega(1_X) = 1$, cf. Lemma C.4. The appropriate generalization of Theorem 1.15 then reads (cf. Corollary B.21):

Theorem 3.3. Let $X$ be a locally compact Hausdorff space. There is a bijective correspondence between states $\varphi$ on $C_0(X)$ and probability measures $\mu$ on $X$, namely

$\varphi(f) = \int_X d\mu \, f, \quad f \in C_0(X)$. (3.2)

Moreover, pure states correspond to Dirac measures and hence to points of $X$.

In particular, a nonzero linear functional $\omega: C_0(X) \to \mathbb{C}$ is multiplicative iff it is a pure state. This recovery of probability measures on phase space as states of the associated algebra of observables $C_0(X)$, and of points in phase space as the associated pure states, already familiar from the finite case, remains of great importance.

As in quantum mechanics, many interesting observables in classical mechanics fail to be bounded, let alone $C_0$; coordinate functions (on non-compact phase spaces) and the usual kinetic energy are a case in point. This is not a serious problem, especially not if, as we shall assume from now on, $X$ is a (smooth) manifold (those unfamiliar with this notion may always have $X = \mathbb{R}^n$ in mind). In that case, there is a very natural class of (typically unbounded) functions on $X$, viz. $C^\infty(X) = C^\infty(X, \mathbb{R})$, which form a commutative algebra just like $C_0(X) = C_0(X, \mathbb{C})$, and provide the (algebraic) basis for the theory of symmetry and dynamics in classical physics, as we shall now show (the fact that functions in $C^\infty(X)$ may be freely added and multiplied provides a major simplification compared to unbounded operators in quantum mechanics, even self-adjoint ones, which are most easily treated by transforming them into bounded ones, as discussed in §B.21). In fact, the most natural mathematical setting of classical physics is not operator theory, or even symplectic geometry (as even mathematically minded people used to think until the 1980s), but rather the more general and flexible framework of Poisson geometry, to which we now turn.
3.1 Vector fields and their flows

We do not assume familiarity with differential geometry and analysis on manifolds, so in what follows one may assume that $X = \mathbb{R}^k$ for some $k$. However, whenever possible we will phrase definitions and results in such a way that their more general meaning should be clear to those who are familiar with differential geometry etc. An old-fashioned vector field on $X = \mathbb{R}^k$ is a map

$\xi: \mathbb{R}^k \to \mathbb{R}^k$; (3.3)

$x \mapsto (\xi^1(x), \ldots, \xi^k(x))$, (3.4)

which describes something like a hyper-arrow at $x$. However, this is a coordinate-dependent object, which is hard to generalize to arbitrary manifolds. Therefore, in a modern approach a vector field is seen as the corresponding first-order differential operator $\xi: C^\infty(X) \to C^\infty(X)$ defined by

$\xi f(x) = \sum_{j=1}^{k} \xi^j(x) \frac{\partial f}{\partial x^j}(x)$. (3.5)

To make the idea precise that a vector field on $X$ is essentially the same as a first-order differential operator on $C^\infty(X)$, we note that it easily follows from (3.5) that

$\xi(fg) = \xi(f) g + f \xi(g)$, (3.6)

for any $f, g \in C^\infty(X)$, where the product $fg$ is defined pointwise, i.e.,

$(fg)(x) = f(x) g(x)$. (3.7)

Similarly, we have pointwise addition and scalar multiplication, i.e., for $s, t \in \mathbb{R}$,

$(sf + tg)(x) = s f(x) + t g(x)$. (3.8)

This turns $C^\infty(X)$ into a commutative algebra (over $\mathbb{R}$, as $C^\infty(X) = C^\infty(X, \mathbb{R})$).

A derivation of an algebra $A$ (over $\mathbb{R}$) is a linear map $\delta: A \to A$ satisfying

$\delta(ab) = \delta(a) b + a \delta(b)$. (3.9)

Thus any vector field on $X$ defines a derivation of the algebra $C^\infty(X)$ by (3.5). Conversely, a deep theorem of differential geometry states that for any manifold $X$, each derivation of $C^\infty(X)$ takes the form (3.5), at least locally (and for $X = \mathbb{R}^k$ also globally). Therefore, either as a definition or as a theorem, we often simply identify vector fields on $X$ with derivations of $C^\infty(X)$. Derivations have a rich structure:

Definition 3.4. A (real) Lie algebra is a (real) vector space $A$ equipped with a bilinear map $[\cdot, \cdot]: A \times A \to A$ that satisfies $[a, b] = -[b, a]$ (and hence $[a, a] = 0$) as well as

$[a, [b, c]] + [c, [a, b]] + [b, [c, a]] = 0$ (Jacobi identity). (3.10)
It is easy to see that the set $\mathrm{Vec}(X)$ of all old-fashioned vector fields $\xi$ on $X$ (i.e., in the sense (3.5)) forms a real Lie algebra under pointwise vector space operations (i.e., $(s\xi + t\eta)(f) = s \xi f + t \eta f$) and the natural bracket

$[\xi, \eta] = \xi\eta - \eta\xi$. (3.11)

Similarly, the set $\mathrm{Der}(A)$ of all derivations on some algebra $A$ is a Lie algebra under pointwise vector space operations and Lie bracket

$[\delta_1, \delta_2] = \delta_1 \circ \delta_2 - \delta_2 \circ \delta_1$. (3.12)

Of course, the identification of $\mathrm{Vec}(X)$ with $\mathrm{Der}(C^\infty(X))$ identifies (3.11) and (3.12).
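In one dimension the bracket (3.11) can be computed by hand: for $\xi = p \, d/dx$ and $\eta = q \, d/dx$ the second-order terms cancel and $[\xi, \eta] = (p q' - q p') \, d/dx$. The sketch below restricts to polynomial coefficients (an assumption made purely for the example), stores a polynomial as a list of coefficients with the constant term first, and checks antisymmetry and the Jacobi identity on three sample fields. All function names are hypothetical.

```python
def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_scale(c, p):
    """Scalar multiple of a polynomial."""
    return [c * a for a in p]

def poly_mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial."""
    return [i * a for i, a in enumerate(p)][1:] or [0]

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def bracket(p, q):
    """Coefficient of d/dx in [p d/dx, q d/dx] = (p q' - q p') d/dx."""
    first = poly_mul(p, poly_diff(q))
    second = poly_mul(q, poly_diff(p))
    return trim(poly_add(first, poly_scale(-1, second)))

def jacobi(p, q, r):
    """[p,[q,r]] + [r,[p,q]] + [q,[r,p]], which should vanish by (3.10)."""
    total = poly_add(bracket(p, bracket(q, r)), bracket(r, bracket(p, q)))
    return trim(poly_add(total, bracket(q, bracket(r, p))))

p = [1]          # the field d/dx
q = [0, 1]       # the field x d/dx
r = [0, 0, 1]    # the field x^2 d/dx
```

For instance `bracket(p, q)` gives `[1]`, i.e. $[d/dx, x \, d/dx] = d/dx$, and `jacobi(p, q, r)` gives the zero polynomial `[0]`.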

Vector fields (or, equivalently, derivations) may be “integrated”, at least locally, in the following sense. First, a curve through $x_0 \in X$ is a smooth map $c: I \to X$, where $I \subset \mathbb{R}$ is open and $c(t_0) = x_0$ for some $t_0 \in I$. We usually assume that $0 \in I$ with $t_0 = 0$ and hence $c(0) = x_0$. We then say that $c$ integrates $\xi$ near $x_0$ if

$\dot{c}(t) = \xi(c(t))$, (3.13)

a somewhat symbolic equality that can be interpreted in two equivalent ways:

• Describing $c: I \to \mathbb{R}^k$ by $k$ functions $c^j: I \to \mathbb{R}$ ($j = 1, \ldots, k$), eq. (3.13) denotes

$\frac{dc^j(t)}{dt} = \xi^j(c(t)), \quad j = 1, \ldots, k$. (3.14)

• More abstractly, eq. (3.13) means that for any $f \in C^\infty(X)$ we have

$\frac{d}{dt} f(c(t)) = \xi f(c(t))$. (3.15)

To pass from (3.15) to (3.14), we just have to recall (3.5), and note that

$\frac{d}{dt} f(c(t)) = \sum_{j=1}^{k} \frac{dc^j(t)}{dt} \frac{\partial f}{\partial x^j}(c(t))$. (3.16)

The theory of ordinary differential equations shows that such local integral curves exist near any point $x_0 \in X$, and that they are unique in the following sense: if two curves $c_1: I_1 \to X$ and $c_2: I_2 \to X$ both satisfy (3.13) with $c_1(0) = c_2(0) = x_0$, then $c_1 = c_2$ on $I_1 \cap I_2$. However, curves that integrate $\xi$ near some point may not be defined for all $t$, i.e., for $I = \mathbb{R}$. This makes the concept of a flow of a vector field $\xi$, which is meant to encapsulate all integral curves of $\xi$, a bit complicated. We start with the simplest case. We say that a vector field $\xi$ is complete if for any $x_0 \in X$ there is a curve $c: \mathbb{R} \to X$ satisfying (3.13) with $c(0) = x_0$. The simplest example of a complete vector field is $X = \mathbb{R}$ and $\xi = d/dx$, so that $\varphi_t(x) = x + t$. For an incomplete example, take $X = \mathbb{R}$ and $\xi(x) = x^2 \, d/dx$. It can be shown that a vector field $\xi$ with compact support (in the sense that the set $\{x \in X \mid \xi(x) \neq 0\}$ is bounded) is complete. In particular, any vector field on a compact manifold is complete.
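The incomplete example can be made explicit: solving $\dot{c} = c^2$ with $c(0) = x_0$ by separation of variables gives $c(t) = x_0/(1 - x_0 t)$, which is defined only for $x_0 t < 1$ and blows up as $t \to 1/x_0$. The sketch below evaluates this closed form and compares a finite-difference derivative with (3.14); the step size and sample points are illustrative choices.

```python
def integral_curve(x0, t):
    """Integral curve of x^2 d/dx through x0, i.e. c(t) = x0 / (1 - x0 t).

    The curve exists only for x0 * t < 1; outside this range an error is raised.
    """
    if x0 * t >= 1:
        raise ValueError("the integral curve through x0 is not defined at this time")
    return x0 / (1 - x0 * t)

def velocity(x0, t, h=1e-6):
    """Central finite-difference approximation of dc/dt."""
    return (integral_curve(x0, t + h) - integral_curve(x0, t - h)) / (2 * h)

value = integral_curve(1.0, 0.5)          # c(1/2) for x0 = 1
speed = velocity(1.0, 0.5)                # should be close to c(1/2)^2
near_blow_up = integral_curve(1.0, 0.999999)
```

Starting at $x_0 = 1$, the curve reaches 2 at $t = 1/2$ with velocity close to 4, and it exceeds any bound before $t = 1$, so no integral curve through this point is defined on all of $\mathbb{R}$.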
Definition 3.5. Let $X$ be a manifold and let $\xi \in \mathrm{Vec}(X)$ be a complete vector field. A flow of $\xi$ is a smooth map $\varphi : \mathbb{R} \times X \to X$, written

$$\varphi_t(x) = \varphi(t,x), \tag{3.17}$$

that satisfies

$$\varphi_0(x) = x; \tag{3.18}$$

$$\varphi_s \circ \varphi_t = \varphi_{s+t}, \tag{3.19}$$

and that integrates $\xi$ in the sense that for each $t \in \mathbb{R}$ and $x \in X$,

$$\xi(\varphi_t(x)) = \frac{d}{dt}\varphi_t(x). \tag{3.20}$$

As before, eq. (3.20) by definition means that for each $f \in C^\infty(X)$ we have

$$\frac{d}{dt} f(\varphi_t(x)) = \xi f(\varphi_t(x)), \tag{3.21}$$

or, equivalently, that in local coordinates, where

$$\varphi_t(x) = (\varphi_t^1(x),\ldots,\varphi_t^k(x)), \tag{3.22}$$

we have

$$\frac{d\varphi_t^j(x)}{dt} = \xi^j(\varphi_t(x)), \quad j = 1,\ldots,k. \tag{3.23}$$

Indeed, the flow $\varphi$ of $\xi$ gives the integral curve $c$ of $\xi$ through $x_0$ by

$$c(t) = \varphi_t(x_0). \tag{3.24}$$

According to the Picard-Lindelöf Theorem in the theory of ordinary differential equations, any complete vector field has a unique flow. In fact, the uniqueness part of this theorem implies that (3.19) is a consequence of (3.20) with (3.18), but it is convenient to state (3.19) separately, so as to make the point that the flow of a complete vector field $\xi$ on $X$ is a smooth $\mathbb{R}$-action on $X$, as defined by conditions (3.18) - (3.19), whose orbits integrate $\xi$. In particular, each $\varphi_t : X \to X$ is invertible, with inverse $\varphi_t^{-1} = \varphi_{-t}$. Moreover, $X$ is a disjoint union of the integral curves of $\xi$, which can never cross each other because of the uniqueness of the solution of the initial-value problem (3.13) with $c(0) = x_0$.

If $\xi$ is not complete, we do the best we can by defining the set

$$D_\xi = \{(t,x) \in \mathbb{R} \times X \mid \exists\, c : I \to X,\ c(0) = x,\ t \in I\} \subset \mathbb{R} \times X, \tag{3.25}$$

where it is understood that $c$ satisfies (3.13). Obviously $\{0\} \times X \subset D_\xi$, and (less trivially) it turns out that $D_\xi$ is open. Then a flow of $\xi$ is a map $\varphi : D_\xi \to X$ that satisfies (3.18) for all $x$, eq. (3.21) for $(t,x) \in D_\xi$, as well as (3.19) whenever defined.

3.2 Poisson brackets and Hamiltonian vector fields

To obtain flows, classical mechanics requires more than a manifold structure:

Definition 3.6. A Poisson bracket on a manifold $X$ is a Lie bracket $\{-,-\}$ on (the real vector space) $C^\infty(X)$, such that for each $h \in C^\infty(X)$ the map

$$\delta_h : f \mapsto \{h,f\} \tag{3.26}$$

is a vector field on $X$ (or, equivalently, a derivation of $C^\infty(X,\mathbb{R})$ with respect to its structure of a commutative algebra under pointwise multiplication). A manifold $X$ equipped with a Poisson bracket is called a Poisson manifold, $(C^\infty(X),\{\,,\})$ is called a Poisson algebra, and $\delta_h$ is called the Hamiltonian vector field of $h$.

Unfolding, we have a bilinear map $\{-,-\} : C^\infty(X) \times C^\infty(X) \to C^\infty(X)$ that satisfies

$$\{g,f\} = -\{f,g\}; \tag{3.27}$$

$$\{f,\{g,h\}\} + \{h,\{f,g\}\} + \{g,\{h,f\}\} = 0; \tag{3.28}$$

$$\{f,gh\} = \{f,g\}h + g\{f,h\}. \tag{3.29}$$

Bilinearity and the abstract properties (3.27) - (3.29) imply:

Proposition 3.7. Each Poisson bracket on $X$ defines a Lie algebra homomorphism

$$C^\infty(X) \to \mathrm{Der}(C^\infty(X)); \tag{3.30}$$

$$h \mapsto \delta_h, \tag{3.31}$$

or, equivalently, a Lie algebra homomorphism

$$C^\infty(X) \to \mathrm{Vec}(X); \tag{3.32}$$

$$h \mapsto \xi_h. \tag{3.33}$$


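The time-honored example is $X = \mathbb{R}^{2n}$, with coordinates $x = (p,q)$ and bracket

$$\{f,g\} = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q^j} - \frac{\partial f}{\partial q^j}\frac{\partial g}{\partial p_j} \right). \tag{3.34}$$

In that case, the Hamiltonian vector field of $h$ is obviously given by

$$\xi_h = \sum_{j=1}^{n} \left( \frac{\partial h}{\partial p_j}\frac{\partial}{\partial q^j} - \frac{\partial h}{\partial q^j}\frac{\partial}{\partial p_j} \right). \tag{3.35}$$

The flow of $\xi_h$ gives the motion of a system with Hamiltonian $h$. Writing

$$\varphi_t(p,q) = (p(t),q(t)),$$

we see from (3.23) that this flow is given by Hamilton's equations

$$\frac{dp_j(t)}{dt} = -\frac{\partial h(p(t),q(t))}{\partial q^j}; \tag{3.36}$$

$$\frac{dq^j(t)}{dt} = \frac{\partial h(p(t),q(t))}{\partial p_j}. \tag{3.37}$$

Hamiltonians of the special form

$$h(p,q) = \frac{p^2}{2m} + V(q), \tag{3.38}$$

where $p^2 = \sum_j p_j^2$, give Newton's equation "$F = ma$", where $F_j = -\partial V/\partial q^j$, viz.

$$F_j(q(t)) = m\,\frac{d^2 q^j(t)}{dt^2}. \tag{3.39}$$

Proposition 3.8. For any vector field $\xi$ on a manifold $X$, we say that a function $f \in C^\infty(X)$ is conserved if $f$ is constant along the flow of $\xi$. If $X$ is a Poisson manifold and $\xi = \xi_h$ is Hamiltonian, then $f$ is conserved iff $\{h,f\} = 0$.

The proof is trivial. A Poisson bracket on $X$ may also be defined in terms of a Poisson tensor. In coordinates, this is just an anti-symmetric matrix $B^{ij}(x)$ that satisfies

$$\sum_{l} \left( B^{li}\,\frac{\partial B^{jk}}{\partial x^l} + B^{lj}\,\frac{\partial B^{ki}}{\partial x^l} + B^{lk}\,\frac{\partial B^{ij}}{\partial x^l} \right) = 0, \tag{3.40}$$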


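for each $(i,j,k)$. In terms of $B$, the Poisson bracket is then defined abstractly by

$$\{f,g\} = B(df,dg), \tag{3.41}$$

using standard notation of differential geometry, or, in coordinates, by

$$\{f,g\}(x) = \sum_{i,j} B^{ij}(x)\, \frac{\partial f(x)}{\partial x^i}\, \frac{\partial g(x)}{\partial x^j}. \tag{3.42}$$

Conversely, a Poisson bracket must come from a Poisson tensor: for any derivation $\delta$ on $C^\infty(X)$, the function $\delta(g)$ depends linearly on $dg$, so if $\delta_f(g) = \{f,g\}$, then $\delta_f(g) = -\delta_g(f)$, so that $\{f,g\}$ depends linearly on both $df$ and $dg$. This enforces (3.42), upon which (3.41) implies (3.40). A nice example is $X = \mathbb{R}^3$ with $B^{ij}(x) = \sum_k \epsilon_{ijk}\, x^k$, i.e., writing $(x^1,x^2,x^3) = (x,y,z)$,

$$\{f,g\} = x\left(\frac{\partial f}{\partial y}\frac{\partial g}{\partial z} - \frac{\partial f}{\partial z}\frac{\partial g}{\partial y}\right) + y\left(\frac{\partial f}{\partial z}\frac{\partial g}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial g}{\partial z}\right) + z\left(\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x}\right). \tag{3.43}$$

Finally, we say that a Poisson manifold is symplectic if the corresponding Poisson tensor $B(x)$ is given by an invertible matrix, for each $x \in X$. This requires $X$ to be even-dimensional. For example, $\mathbb{R}^{2n}$ with Poisson bracket (3.34) is symplectic.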
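As a numerical illustration of Hamilton's equations (3.36) - (3.37) for a Hamiltonian of the form (3.38), the following Python sketch integrates the flow for one degree of freedom. Everything specific here is an assumption made for the example: the potential $V(q) = kq^2/2$, the values $m = k = 1$, the Runge-Kutta integrator, and the function names. It also illustrates Proposition 3.8 in the simplest case $f = h$, since $\{h,h\} = 0$ forces the energy to be conserved along the flow.

```python
import math


def hamiltonian(p, q, m=1.0, k=1.0):
    """h(p, q) = p^2 / 2m + V(q), with the assumed potential V(q) = k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def hamilton_rhs(p, q, m=1.0, k=1.0):
    """Right-hand sides of Hamilton's equations: dp/dt = -dh/dq, dq/dt = dh/dp."""
    return -k * q, p / m


def rk4_flow(p, q, t, steps=2000):
    """Approximate the flow phi_t(p, q) of the Hamiltonian vector field by RK4."""
    dt = t / steps
    for _ in range(steps):
        k1p, k1q = hamilton_rhs(p, q)
        k2p, k2q = hamilton_rhs(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
        k3p, k3q = hamilton_rhs(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
        k4p, k4q = hamilton_rhs(p + dt * k3p, q + dt * k3q)
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    return p, q


p0, q0 = 0.5, 1.0
p1, q1 = rk4_flow(p0, q0, 1.0)

# Exact flow for m = k = 1: a rotation in the (q, p)-plane.
p_exact = p0 * math.cos(1.0) - q0 * math.sin(1.0)
q_exact = q0 * math.cos(1.0) + p0 * math.sin(1.0)

# The Hamiltonian itself is conserved along its own flow.
energy_drift = abs(hamiltonian(p1, q1) - hamiltonian(p0, q0))
```

A symplectic integrator would be the more natural choice for long-time runs; RK4 is used here only because it is short and accurate enough over a single time unit.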
3.3 Symmetries of Poisson manifolds

Two equivalent notions of symmetries of classical physics suggest themselves: one is based on the idea of a Poisson manifold $(X,B)$, the other comes from the equivalent notion of a Poisson algebra $(C^\infty(X), \{\,,\})$.

Definition 3.9. 1. A symmetry of a Poisson manifold $(X,B)$ is a diffeomorphism $\varphi : X \to X$ (that is, an invertible smooth map with smooth inverse) satisfying

$$\varphi_* B = B. \tag{3.44}$$

2. A symmetry of a Poisson algebra $(C^\infty(X), \{\,,\})$ is an invertible linear map $\alpha : C^\infty(X) \to C^\infty(X)$ that satisfies (for each $f,g \in C^\infty(X)$):

$$\alpha(fg) = \alpha(f)\alpha(g); \tag{3.45}$$

$$\alpha(\{f,g\}) = \{\alpha(f),\alpha(g)\}. \tag{3.46}$$

Let us define the push-forward in (3.44). We do this in terms of the pullback $\varphi^*$ of a smooth (i.e., infinitely often differentiable) map $\varphi : X \to X$, defined as

$$\varphi^* : C^\infty(X) \to C^\infty(X); \tag{3.47}$$

$$\varphi^* f = f \circ \varphi. \tag{3.48}$$

If $\varphi$ is a diffeomorphism, the push-forward $\varphi_*$ of $\varphi$, which acts on derivations, is

$$\varphi_* : \mathrm{Der}(C^\infty(X)) \to \mathrm{Der}(C^\infty(X)); \tag{3.49}$$

$$(\varphi_*\delta)(f) = \delta(\varphi^* f) \circ \varphi^{-1}; \tag{3.50}$$

this may be checked to define a derivation, as follows:

$$\begin{aligned}
(\varphi_*\delta)(fg) &= (\varphi^{-1})^* \delta(\varphi^*(f \cdot g)) \\
&= (\varphi^{-1})^* \delta(\varphi^*(f)\varphi^*(g)) \\
&= (\varphi^{-1})^* \big(\delta(\varphi^*(f))\varphi^*(g) + \varphi^*(f)\delta(\varphi^*(g))\big) \\
&= (\varphi_*\delta)(f) \cdot g + f \cdot (\varphi_*\delta)(g).
\end{aligned}$$

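If, given coordinates $x = (x^1,\ldots,x^k)$ on $X$, we now (without loss of generality) take our derivation $\delta$ to be a vector field $\xi = \sum_j \xi^j\, \partial/\partial x^j$, and write $\varphi(x) = (\varphi^1(x),\ldots,\varphi^k(x))$, for the image $\varphi_*(\xi)$ we obtain

$$(\varphi_*\xi) f(x) = (\xi(\varphi^* f))(\varphi^{-1}(x)) = \sum_{j,l} \xi^l(\varphi^{-1}(x))\, \frac{\partial f(x)}{\partial x^j}\, \frac{\partial \varphi^j}{\partial x^l}(\varphi^{-1}(x)),$$

so that

$$(\varphi_*\xi)^j(x) = \sum_{l} \xi^l(\varphi^{-1}(x))\, \frac{\partial \varphi^j}{\partial x^l}(\varphi^{-1}(x)), \tag{3.51}$$

or, equivalently,

$$(\varphi_*\xi)^j(\varphi(x)) = \sum_{l} \xi^l(x)\, \frac{\partial \varphi^j(x)}{\partial x^l}, \tag{3.52}$$

which only depends on $\xi(x)$, so that for each $x \in X$, $\varphi_*$ may be localized to a linear map $\varphi_*(x) : T_x X \to T_{\varphi(x)} X$. This may be done even if $\varphi$ is not invertible. Physicists often write this as $\varphi(x) = y = y(x^1,\ldots,x^k)$, $\xi = v$, $\varphi_*\xi = v'$, so that we have a "covariant" transformation rule $(v')^i(y) = \sum_{j=1}^{k} (\partial y^i/\partial x^j)\, v^j(x)$.

Taking tensor products, one obtains similar rules for higher-order tensors. For example, for a map $\varphi : X \to X$, the transformation rule for the Poisson tensor $B$ reads

$$\varphi_* B^{ij}(\varphi(x)) = \sum_{m,n=1}^{k} \frac{\partial \varphi^i(x)}{\partial x^m}\, \frac{\partial \varphi^j(x)}{\partial x^n}\, B^{mn}(x), \tag{3.53}$$

so that, in coordinates, the invariance requirement (3.44) reads

$$\sum_{m,n=1}^{k} \frac{\partial \varphi^i(x)}{\partial x^m}\, \frac{\partial \varphi^j(x)}{\partial x^n}\, B^{mn}(x) = B^{ij}(\varphi(x)). \tag{3.54}$$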
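For a linear map $\varphi(x) = Mx$ the partial derivatives in (3.54) are just the matrix entries of $M$, so the invariance requirement can be tested by a finite computation. The Python sketch below does this for the canonical Poisson tensor on $\mathbb{R}^2$ in coordinates $x = (p,q)$. The helper name `transform_tensor` and the two test matrices are hypothetical choices made for illustration; because $B$ is constant here, the right-hand side $B^{ij}(\varphi(x))$ of (3.54) is simply $B^{ij}$.

```python
def transform_tensor(M, B):
    """Left-hand side of (3.54) for the linear map phi(x) = M x.

    Since d(phi^i)/d(x^m) = M[i][m], this is sum_{m,n} M[i][m] * M[j][n] * B[m][n].
    """
    k = len(B)
    return [[sum(M[i][m] * M[j][n] * B[m][n] for m in range(k) for n in range(k))
             for j in range(k)]
            for i in range(k)]


# Canonical Poisson tensor on R^2 in coordinates (p, q), for which {p, q} = 1.
B = [[0, 1], [-1, 0]]

shear = [[1, 0], [3, 1]]    # determinant 1
stretch = [[2, 0], [0, 1]]  # determinant 2

is_symmetry_shear = transform_tensor(shear, B) == B
is_symmetry_stretch = transform_tensor(stretch, B) == B
```

The shear preserves $B$ while the stretch rescales it by its determinant, in line with the fact that in two dimensions the linear Poisson symmetries of the canonical bracket are exactly the matrices of determinant one.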


Theorem 3.10. The two parts of Definition 3.9 are equivalent, in that:

1. Given a diffeomorphism $\varphi : X \to X$ satisfying (3.44), the map

$$\alpha = \varphi^*, \tag{3.55}$$

i.e., $\alpha(f) = f \circ \varphi$, is linear, invertible, and satisfies (3.45) - (3.46).

2. Given an invertible linear map $\alpha : C^\infty(X) \to C^\infty(X)$ that satisfies (3.45) - (3.46), there is a unique diffeomorphism $\varphi : X \to X$ inducing $\alpha$ as in (3.55).

3. This correspondence defines an anti-isomorphism between the group $\mathrm{Diff}(X,B)$ of diffeomorphisms of $X$ satisfying (3.44) and the group $\mathrm{Aut}(C^\infty(X), \{\,,\})$ of invertible linear maps $\alpha : C^\infty(X) \to C^\infty(X)$ that satisfy (3.45) - (3.46).

Here an anti-isomorphism of groups is just an isomorphism that inverts the order of multiplication. This complication may be removed by writing $\varphi^{-1}$ instead of $\varphi$ in (3.55), but that change would make the next proposition a bit less natural.

Proof. The first claim is true by construction. The hard part is the second claim, which follows from a more general result about manifolds (note that in our terminology, manifolds are by definition assumed to be Hausdorff):

Proposition 3.11. Let $X$ and $Y$ be smooth manifolds. Then (3.55) establishes a bijective correspondence between linear maps $\alpha : C^\infty(X) \to C^\infty(Y)$ satisfying (3.45) and smooth maps $\varphi : Y \to X$.

The proof is quite similar to a central part of the proof of Gelfand duality for commutative C*-algebras, in which (3.55) establishes a bijective correspondence between C*-homomorphisms $\alpha : C(X) \to C(Y)$ and continuous maps $\varphi : Y \to X$, where $X$ and $Y$ are compact Hausdorff spaces; see §C.3 and especially Proposition C.22.

For any commutative real algebra $A$, let $\Sigma(A)$ be the space of non-zero algebra homomorphisms $\omega : A \to \mathbb{R}$ (these are just the non-zero multiplicative linear maps), equipped with the weakest topology that makes each function $\hat{a} : \Sigma(A) \to \mathbb{R}$ continuous, where $\hat{a}(\omega) = \omega(a)$. Furthermore, if $B$ is another commutative real algebra, then any homomorphism $\alpha : A \to B$ induces a continuous map $\alpha^* : \Sigma(B) \to \Sigma(A)$ in the obvious way, that is, by $\alpha^*\omega = \omega \circ \alpha$. In the special case $A = C^\infty(X)$ (and similarly if $A = C(X)$), one has a canonical map $\mathrm{ev}^X : X \to \Sigma(C^\infty(X))$, $x \mapsto \mathrm{ev}_x$, given by $\mathrm{ev}_x(f) = f(x)$. The whole point (in which the entire difficulty of the proof lies) is that this map is a bijection (see Proposition C.21), which simultaneously equips $\Sigma(C^\infty(X))$ with a smooth structure that makes $\mathrm{ev}^X$ a diffeomorphism (by definition of the smooth structure on $\Sigma(C^\infty(X))$). In view of all this, given a multiplicative linear map $\alpha : C^\infty(X) \to C^\infty(Y)$, we obtain a continuous map $\varphi : Y \to X$ by

$$\varphi = (\mathrm{ev}^X)^{-1} \circ \alpha^* \circ \mathrm{ev}^Y. \tag{3.56}$$

Eq. (3.55) then holds by construction. Smoothness of $\varphi$, then, is a consequence of the fact that $\alpha(f) = f \circ \varphi$ must be a smooth function on $Y$ for any $f \in C^\infty(X)$.

Applying this to the setting of Theorem 3.10 easily yields all claims. □

In what follows, we look at smooth actions of Lie groups on (Poisson) manifolds $X$, in other words, at homomorphisms $\varphi : G \to \mathrm{Diff}(X)$ or $\varphi : G \to \mathrm{Diff}(X,B)$, where $G$ is a Lie group, $\mathrm{Diff}(X)$ is the group of all diffeomorphisms of a manifold, and $\mathrm{Diff}(X,B)$ is the group of all diffeomorphisms of a Poisson manifold preserving the Poisson structure. Foregoing the underlying differential geometry, we take a pragmatic attitude and only study linear Lie groups, defined as closed subgroups $G$ of $GL_n(\mathbb{R})$ or $GL_n(\mathbb{C})$, with group multiplication given by matrix multiplication and hence group inverse being matrix inverse. Here one may think of $SU(2) \subset GL_2(\mathbb{C})$ or $SO(3) \subset GL_3(\mathbb{R})$, but also abelian Lie groups like the additive groups $\mathbb{R}^n$ fall under this scope, since one may identify $a \in \mathbb{R}^n$ with the $2n \times 2n$-matrix



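in which case matrix multiplication indeed reproduces addition. Similarly, the $(2n+1)$-dimensional Heisenberg group $H_n$ is the group of real $(n+2) \times (n+2)$-matrices

$$(a,b,c) = \begin{pmatrix} 1 & a^T & c + \tfrac{1}{2} a^T b \\ 0 & 1_n & b \\ 0 & 0 & 1 \end{pmatrix}, \tag{3.58}$$

where $a,b \in \mathbb{R}^n$, $c \in \mathbb{R}$, and $a^T b = \langle a,b\rangle$; this gives the multiplication rule

$$(a,b,c) \cdot (a',b',c') = \left(a + a',\, b + b',\, c + c' + \tfrac{1}{2}\big(\langle a,b'\rangle - \langle a',b\rangle\big)\right). \tag{3.59}$$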
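The group law can be checked by multiplying the matrices directly. The Python sketch below does so for $n = 1$, under the assumption that the first row of (3.58) reads $(1, a^T, c + \tfrac{1}{2} a^T b)$; with that reading, matrix multiplication yields the central component $c + c' + \tfrac{1}{2}(\langle a,b'\rangle - \langle a',b\rangle)$. The function names `heis`, `matmul` and `product_rule` are hypothetical.

```python
def heis(a, b, c):
    """3x3 matrix of (a, b, c) in the Heisenberg group H_1 (n = 1).

    Assumes the first row is (1, a, c + a*b/2), as in the reading of (3.58) used here.
    """
    return [[1.0, a, c + 0.5 * a * b],
            [0.0, 1.0, b],
            [0.0, 0.0, 1.0]]


def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product_rule(g, h):
    """Multiplication rule in the coordinates (a, b, c) for n = 1."""
    (a, b, c), (a2, b2, c2) = g, h
    return (a + a2, b + b2, c + c2 + 0.5 * (a * b2 - a2 * b))


g, h = (1.0, 2.0, 3.0), (-0.5, 4.0, 0.25)
lhs = matmul(heis(*g), heis(*h))   # product of the two matrices
rhs = heis(*product_rule(g, h))    # matrix of the product in coordinates
```

Comparing `lhs` and `rhs` entry by entry confirms that the coordinate rule and the matrix product agree for this sample; the central elements $(0,0,c)$ commute with everything, as the rule shows.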
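If $G$ is a linear Lie group, its Lie algebra $\mathfrak{g}$ may be defined as the vector space

$$\mathfrak{g} = \{A \in M_n(\mathbb{K}) \mid e^{tA} \in G \ \ \forall t \in \mathbb{R}\}, \tag{3.60}$$

where $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, as determined by the embedding $G \subset GL_n(\mathbb{R})$ or $G \subset GL_n(\mathbb{C})$. Either way, $\mathfrak{g}$ is seen as a real vector space, equipped with the Lie bracket

$$[A,B] = AB - BA. \tag{3.61}$$

This is trivially a bilinear antisymmetric map $\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ satisfying the Jacobi identity

$$[A,[B,C]] + [C,[A,B]] + [B,[C,A]] = 0, \tag{3.62}$$

which in turn expresses the fact that for fixed $A \in \mathfrak{g}$ the map $\delta_A : \mathfrak{g} \to \mathfrak{g}$ defined by

$$\delta_A(B) = [A,B] \tag{3.63}$$

is a derivation of $\mathfrak{g}$ with respect to its Lie bracket, i.e.,

$$\delta_A([B,C]) = [\delta_A(B),C] + [B,\delta_A(C)]. \tag{3.64}$$

The exponential map $\exp : \mathfrak{g} \to G$ is then just given by its usual power series, which for matrices is norm-convergent. Conversely, one may pass from $G$ to $\mathfrak{g}$ through

$$A = \frac{d}{dt}\big(e^{tA}\big)\Big|_{t=0}. \tag{3.65}$$

If $G = \mathbb{R}^n$, we also have $\mathfrak{g} = \mathbb{R}^n$, and eq. (3.57) implies that exp is the identity map.

For example, since $SO(3)$ is the subgroup of $GL_3(\mathbb{R})$ consisting of matrices $B$ that satisfy $B^T B = 1_3$, its Lie algebra $\mathfrak{so}(3)$ consists of all matrices $a$ that satisfy $a^T = -a$. As a vector space we have $\mathfrak{so}(3) \cong \mathbb{R}^3$, which follows by choosing a basis

$$J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad J_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{3.66}$$

of the $3 \times 3$ real antisymmetric matrices. The commutators of these elements are

$$[J_1,J_2] = J_3; \quad [J_3,J_1] = J_2; \quad [J_2,J_3] = J_1. \tag{3.67}$$

For the Lie algebra of the Heisenberg group we obtain $\mathfrak{h}_n \cong \mathbb{R}^{2n+1}$ with basis

$$P_i = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -e_i \\ 0 & 0 & 0 \end{pmatrix}, \quad Q_j = \begin{pmatrix} 0 & e_j^T & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{3.68}$$

where $(e_1,\ldots,e_n)$ is the usual basis of $\mathbb{R}^n$, satisfying commutation relations

$$[P_i,Q_j] = \delta_{ij} Z; \quad [P_i,P_j] = [Q_i,Q_j] = [P_i,Z] = [Q_j,Z] = 0. \tag{3.69}$$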
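The commutation relations (3.67) and (3.69) are finite matrix identities and can be verified mechanically. The following Python sketch does so with integer matrices, taking $n = 1$ for the Heisenberg algebra so that $e_1 = 1$; the helper names `matmul` and `commutator` are hypothetical.

```python
def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(x, y):
    """Lie bracket (3.61): [x, y] = x y - y x."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]


# Basis (3.66) of so(3).
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

# Basis (3.68) of the Heisenberg Lie algebra for n = 1.
P = [[0, 0, 0], [0, 0, -1], [0, 0, 0]]
Q = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
Z = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

so3_ok = (commutator(J1, J2) == J3
          and commutator(J3, J1) == J2
          and commutator(J2, J3) == J1)
heisenberg_ok = (commutator(P, Q) == Z
                 and commutator(P, Z) == ZERO
                 and commutator(Q, Z) == ZERO)
```

The minus sign in the entry $-e_i$ of $P_i$ is what produces $[P_i,Q_j] = +\delta_{ij}Z$ rather than its negative.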
3.4 The momentum map

Leaving out the Poisson structure for the moment, let $X$ be a manifold, let $G$ be a Lie group, and let $\varphi : G \to \mathrm{Diff}(X)$ be a homomorphism; as already mentioned, this corresponds to a smooth action $\varphi : G \times X \to X$, which we simply write as

$$\gamma \cdot x = \varphi_\gamma(x) = \varphi(\gamma,x).$$

In terms of the pullback $\varphi_\gamma^*(f) = f \circ \varphi_\gamma$, we then automatically have

$$\varphi_\gamma^*(fg) = \varphi_\gamma^*(f)\,\varphi_\gamma^*(g). \tag{3.70}$$

For each $A \in \mathfrak{g}$ we then define a map $\delta_A : C^\infty(X) \to C^\infty(X)$ by

$$\delta_A f(x) = \frac{d}{dt} f(e^{-tA} \cdot x)\Big|_{t=0}. \tag{3.71}$$

This map is obviously linear. Moreover, it can be shown that $\delta$ is well behaved:

Proposition 3.12. The map $\delta : \mathfrak{g} \to \mathrm{Der}(C^\infty(X))$, $A \mapsto \delta_A$ is a homomorphism of Lie algebras, i.e., each $\delta_A$ is a derivation, $\delta$ is linear in $A$, and, for each $A,B \in \mathfrak{g}$,

$$[\delta_A,\delta_B] = \delta_{[A,B]}. \tag{3.72}$$

The proof relies on Hadamard's Lemma, which we only need for complete vector fields, or, equivalently, for derivations with complete flow (i.e., defined for all $t$).

Lemma 3.13. If $\delta$ is a derivation of $C^\infty(X)$ with complete flow $\varphi$, and $f \in C^\infty(X)$, then there is a function $g(t,x) = g_t(x)$ such that for all $x$ and $t$,

$$g_0(x) = \delta f(x); \tag{3.73}$$

$$f(\varphi_t(x)) = f(x) + t\, g_t(x). \tag{3.74}$$

Indeed, if the flow is complete one may take

$$g_t(x) = \int_0^1 ds\, \dot{F}(st,x), \tag{3.75}$$

where $F(t,x) = f(\varphi_t(x))$ and (in Newton's notation) $\dot{F}$ is the time derivative of $F$.

Proof. To prove that $\delta_A$ is linear in $A$, let $\varphi$ be the flow of $\delta_A$, i.e., $\varphi_t(x) = e^{-tA}x$. For $B \in \mathfrak{g}$, Hadamard's Lemma with $\delta \rightsquigarrow \delta_A$ and $x \rightsquigarrow e^{-tB}x$ then gives us

$$f(e^{-tA}e^{-tB}x) = f(\varphi_t(e^{-tB}x)) = f(e^{-tB}x) + t\, g_t(e^{-tB}x);$$

$$\frac{d}{dt} f(e^{-tA}e^{-tB}x)\Big|_{t=0} = \delta_B f(x) + g_0(x) = \delta_B f(x) + \delta_A f(x). \tag{3.76}$$

On the other hand, since $A$ and $B$ are matrices, we may use the CBH-formula

$$e^{-tA}e^{-tB} = e^{-t(A+B) + \frac{1}{2}t^2[A,B] + O(t^3)}, \tag{3.77}$$

which gives $f(e^{-tA}e^{-tB}x) = f(e^{-t(A+B)}x + O(t^2))$, and hence

$$\frac{d}{dt} f(e^{-tA}e^{-tB}x)\Big|_{t=0} = \frac{d}{dt} f(e^{-t(A+B)}x)\Big|_{t=0} = \delta_{A+B} f(x). \tag{3.78}$$

Comparing (3.76) with (3.78) gives $\delta_{A+B} = \delta_A + \delta_B$. The property $\delta_{sA} = s\delta_A$ is trivial. We now prove (3.72). Within the (matrix) Lie algebra $\mathfrak{g}$ we have

$$[A,B] = -\frac{d}{dt}\big(e^{-tA}Be^{tA}\big)\Big|_{t=0} = -\lim_{t \to 0} \frac{e^{-tA}Be^{tA} - B}{t}. \tag{3.79}$$

Furthermore, for any $g \in G$ one has $e^{gAg^{-1}} = g e^{A} g^{-1}$, so linearity of $\delta$ gives

$$\begin{aligned}
\delta_{[A,B]} f(x) &= -\lim_{t \to 0} \frac{1}{t}\left(\delta_{e^{-tA}Be^{tA}} f(x) - \delta_B f(x)\right) \\
&= \lim_{t \to 0} \frac{1}{t}\left(\frac{d}{ds} f(e^{-tA}e^{sB}e^{tA}x) - \frac{d}{ds} f(e^{sB}x)\right)\Big|_{s=0} \\
&= \lim_{s,t \to 0} \frac{1}{st}\left(f(e^{-tA}e^{sB}e^{tA}x) - f(e^{-tA}e^{tA}e^{sB}x)\right) \\
&= \lim_{s,t \to 0} \frac{1}{st}\left(f \circ \varphi_t(e^{sB}e^{tA}x) - f \circ \varphi_t(e^{tA}e^{sB}x)\right) \\
&= \lim_{s,t \to 0} \left(\frac{1}{st}\left(f(e^{sB}e^{tA}x) - f(e^{tA}e^{sB}x)\right) + \frac{1}{s}\left(g_t(e^{sB}e^{tA}x) - g_t(e^{tA}e^{sB}x)\right)\right) \\
&= [\delta_A,\delta_B] f(x),
\end{aligned}$$

since in the limit $t \to 0$ the third term in the penultimate line cancels the fourth. □

Now suppose that, in addition, $X$ is a Poisson manifold, and that each $\varphi_\gamma$ acts on $X$ as a Poisson symmetry, in that

$$(\varphi_\gamma)_* B = B, \tag{3.80}$$

cf. (3.44), or, equivalently, cf. (3.46),

$$\varphi_\gamma^*(\{f,g\}) = \{\varphi_\gamma^*(f),\varphi_\gamma^*(g)\}. \tag{3.81}$$

This implies, for each $A \in \mathfrak{g}$, and each $f,g \in C^\infty(X)$,

$$\delta_A(\{f,g\}) = \{\delta_A(f),g\} + \{f,\delta_A(g)\}. \tag{3.82}$$

Compare this with the following property $\delta_A$ already has since it is a derivation:

$$\delta_A(fg) = \delta_A(f)g + f\delta_A(g). \tag{3.83}$$

We may call a derivation $\delta : C^\infty(X) \to C^\infty(X)$ satisfying the like of (3.82), i.e.,

$$\delta(\{f,g\}) = \{\delta(f),g\} + \{f,\delta(g)\}, \tag{3.84}$$

a Poisson derivation. We are already familiar with a large class of Poisson derivations: for each $h \in C^\infty(X)$, the corresponding map $\delta_h$ defined by (3.26) is a Poisson derivation (this follows from the Jacobi identity). Let us call a Poisson derivation of the kind $\delta_h$ inner. This raises the question whether our derivations $\delta_A$ are inner.

Definition 3.14. A momentum map for a Lie group $G$ acting on a Poisson manifold $X$ is a map

$$J : X \to \mathfrak{g}^* \tag{3.85}$$

such that for each $A \in \mathfrak{g}$,

$$\delta_A = \delta_{J_A}, \tag{3.86}$$

where the function $J_A \in C^\infty(X)$ is defined by

$$J_A(x) = \langle J(x),A\rangle = J(x)(A). \tag{3.87}$$

In other words, for each $A \in \mathfrak{g}$ and $f \in C^\infty(X)$ we must have

$$\delta_A(f) = \{J_A,f\}. \tag{3.88}$$

A Lie group action admitting a momentum map is called Hamiltonian.
Equivalently, a momentum map is a linear map

$$J^* : \mathfrak{g} \to C^\infty(X) \tag{3.89}$$

such that $\delta_A = \delta_{J^*(A)}$; the connection between the two definitions is given by

$$J_A = J^*(A). \tag{3.90}$$

The pullback notation $J^*$ would suggest that it is a map $C^\infty(\mathfrak{g}^*) \to C^\infty(X)$, which is not quite the case, but it is a near miss: we embed $\mathfrak{g} \hookrightarrow C^\infty(\mathfrak{g}^*)$ by $A \mapsto \hat{A}$, where $\hat{A}(\theta) = \theta(A)$, so $J^* : \mathfrak{g} \to C^\infty(X)$ is the restriction of the pullback $J^*$ to $\mathfrak{g}$. Another near miss would be to read $J^*$ as the adjoint to $J$, which maps $\mathfrak{g}^{**} = \mathfrak{g}$ to the 'dual' $X^*$, but since $X$ may not be a vector space, this dual cannot be defined as in linear algebra, so instead of all linear maps from $X$ to $\mathbb{R}$ we might as well say that it consists of all smooth functions on $X$. Either way, the symbol $J^*$ seems justified.

Proposition 3.15. Let $G$ be a connected Lie group that acts on a Poisson manifold $X$. If this action is Hamiltonian (i.e., if it has a momentum map), then $G$ acts on $(X,B)$ by Poisson symmetries (in the sense that (3.81) holds).

Proof. An easy computation shows that (3.82) holds. We omit the proof of the fact that for connected Lie groups this "infinitesimal" property is equivalent to (3.81); this relies on the fact that $G$ is generated by the image of the exponential map. □

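The converse is not true: if $G$ acts by Poisson symmetries, the action is not necessarily Hamiltonian. For example, take $X = \mathbb{R}^2$ with the unusual Poisson bracket

$$\{f,g\}(p,q) = p\left(\frac{\partial f}{\partial p}\frac{\partial g}{\partial q} - \frac{\partial f}{\partial q}\frac{\partial g}{\partial p}\right), \tag{3.91}$$

and let $G = \mathbb{R}$ act on $\mathbb{R}^2$ by $b \cdot (p,q) = (p,q+b)$. This action satisfies (3.81), and has a single generator $\delta = -\partial/\partial q$. But there clearly is no function $J \in C^\infty(\mathbb{R}^2)$ such that $\{J,f\} = -\partial f/\partial q$ (it should be $J(p,q) = -\log(p)$, which is singular at $p = 0$). However, in most "everyday situations" momentum maps exist:

1. Take $X = \mathbb{R}^3 \times \mathbb{R}^3 = \mathbb{R}^6$ with coordinates $x = (\mathbf{p},\mathbf{q})$, where $\mathbf{p} = (p_1,p_2,p_3)$ and $\mathbf{q} = (q^1,q^2,q^3)$, equipped with the canonical Poisson bracket (3.34).

a. Let $G = \mathbb{R}^6$ act on $X$ by

$$(\mathbf{a},\mathbf{b}) \cdot (\mathbf{p},\mathbf{q}) = (\mathbf{p} + \mathbf{a},\, \mathbf{q} + \mathbf{b}). \tag{3.92}$$

This action is Hamiltonian, with momentum map

$$J(\mathbf{p},\mathbf{q}) = (\mathbf{q},-\mathbf{p}). \tag{3.93}$$

b. Let $G = SO(3)$ act on the same space $X$ by

$$R \cdot (\mathbf{p},\mathbf{q}) = (R\mathbf{p},R\mathbf{q}). \tag{3.94}$$

Also this action is Hamiltonian, with momentum map

$$J(\mathbf{p},\mathbf{q}) = \mathbf{p} \times \mathbf{q}. \tag{3.95}$$

2. Let $G = SO(3)$ act on $X = \mathbb{R}^3$ equipped with the Poisson bracket (3.43), through its defining representation. This action has a momentum map

$$J(\mathbf{x}) = \mathbf{x}, \tag{3.96}$$

where we have identified $\mathfrak{g}$ with $\mathbb{R}^3$ by choosing the basis (3.66) of $\mathfrak{g}$, and have identified $\mathfrak{g}^*$ with $\mathfrak{g}$ (and hence with $\mathbb{R}^3$ also) by the usual inner product on $\mathbb{R}^3$.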

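3. The previous example is a special case of the Lie-Poisson structure. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. Choose a basis $(T_a)$ of $\mathfrak{g}$, with associated structure constants $C_{ab}^c$ defined by the Lie bracket on $\mathfrak{g}$ as

$$[T_a,T_b] = \sum_c C_{ab}^c\, T_c. \tag{3.97}$$

We write $\theta$ in the dual vector space $\mathfrak{g}^*$ as $\theta = \sum_a \theta_a \omega^a$, where $(\omega^a)$ is the dual basis to the chosen basis $(T_b)$ of $\mathfrak{g}$, i.e., $\omega^a(T_b) = \delta_{ab}$. In terms of these coordinates, the Lie-Poisson bracket on $C^\infty(\mathfrak{g}^*)$ is defined by

$$\{f,g\}(\theta) = \sum_{a,b,c} C_{ab}^c\, \theta_c\, \frac{\partial f(\theta)}{\partial \theta_a}\, \frac{\partial g(\theta)}{\partial \theta_b}. \tag{3.98}$$

Equivalently, the Poisson bracket (3.98) may be defined by the condition

$$\{\hat{A},\hat{B}\} = \widehat{[A,B]}, \tag{3.99}$$

where $A,B \in \mathfrak{g}$ and $\hat{A} \in C^\infty(\mathfrak{g}^*)$ is the evaluation map $\hat{A}(\theta) = \theta(A)$.

Now $G$ canonically acts on $\mathfrak{g}^*$ through the coadjoint representation, defined by

$$(x \cdot \theta)(A) = \theta(x^{-1}Ax). \tag{3.100}$$

This action is Hamiltonian with respect to the Lie-Poisson bracket (3.98), the associated momentum map simply being the identity map $\mathfrak{g}^* \to \mathfrak{g}^*$, as in (3.96). In other words, we have

$$J_A = \hat{A}, \tag{3.101}$$

whose correctness may be verified from the computation

$$\begin{aligned}
\delta_A \hat{B}(\theta) &= \frac{d}{dt} \hat{B}(e^{-tA} \cdot \theta)\Big|_{t=0} = \frac{d}{dt} \theta(e^{tA}Be^{-tA})\Big|_{t=0} \\
&= \theta([A,B]) = \widehat{[A,B]}(\theta) = \{\hat{A},\hat{B}\}(\theta) \\
&= \{J_A,\hat{B}\}(\theta).
\end{aligned}$$

4. Let $X = T^*Q$ for some manifold $Q$, e.g. $Q = \mathbb{R}^n$ and hence $X = \mathbb{R}^{2n}$. We take

$$G = \mathrm{Diff}(Q), \tag{3.102}$$

i.e., the diffeomorphism group of $Q$. This is an infinite-dimensional Lie group (if described in the right way). The defining action of $\varphi \in G$ on $Q$ induces an action called $\varphi^*$ on $T^*Q$, given (in coordinates) by

$$\varphi^*(p,q) = (p',q'); \tag{3.103}$$

$$(q')^i = \varphi^i(q); \tag{3.104}$$

$$p'_i = \sum_j p_j\, \frac{\partial (\varphi^{-1})^j}{\partial q^i}(\varphi(q)). \tag{3.105}$$

This may be taken as a definition, but in the language of differential geometry this comes down to the neater prescription that if $\theta = \sum_j p_j\, dq^j \in T_q^*Q$, then $\varphi^*\theta \in T_{\varphi(q)}^*Q$ is the one-form that maps a vector $X \in T_{\varphi(q)}Q$ to $\theta(\varphi_*^{-1}(X))$, i.e.,

$$(\varphi^*\theta)(X) = \theta(\varphi_*^{-1}(X)), \tag{3.106}$$

where $\varphi_*^{-1}(X) = \sum_j (\varphi_*^{-1}X)^j\, \partial/\partial q^j$ is given componentwise by, cf. (3.52),

$$(\varphi_*^{-1}X)^j = \sum_i \frac{\partial (\varphi^{-1})^j}{\partial q^i}\, X^i. \tag{3.107}$$

If $Q = \mathbb{R}^3$ and $\varphi = R \in SO(3)$, then, using $R^{-1} = R^T$, we find that (3.104) - (3.105) simply become $R^*(\mathbf{p},\mathbf{q}) = (R\mathbf{p},R\mathbf{q})$, as in (3.94).
Furthermore, if $\varphi(\mathbf{q}) = \mathbf{q} + \mathbf{b}$, then the partial derivatives in (3.105) form the identity matrix, so that $\varphi^*(\mathbf{p},\mathbf{q}) = (\mathbf{p},\mathbf{q} + \mathbf{b})$. To show that the action of $\mathrm{Diff}(Q)$ on $T^*Q$ is Hamiltonian and compute its momentum map, we need to know that the Lie algebra of $\mathrm{Diff}(Q)$ is the space $\mathrm{Vec}(Q)$ of all vector fields on $Q$, with its canonical Lie bracket (3.61)! We will not prove this, but the exponential map $\exp : \mathfrak{g} \to G$ is given through the flow $\varphi$ of the vector field $X$ on $Q$ by (cf. (3.20))

$$e^{tX} = \varphi_t. \tag{3.108}$$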

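Theorem 3.16. The action of $\mathrm{Diff}(Q)$ on $T^*Q$ has momentum map

$$J_X(p,q) = -\sum_j p_j X^j(q), \tag{3.109}$$

and hence is Hamiltonian. Moreover, this momentum map satisfies

$$\{J_X,J_Y\} = -J_{[X,Y]}, \tag{3.110}$$

where $[X,Y]$ denotes the commutator of vector fields.

Proof. First note that $\varphi_t^{-1} = \varphi_{-t}$, so from (3.71), (3.108), and (3.104) - (3.105),

$$\begin{aligned}
\delta_X f(p,q) &= \frac{d}{dt} f(\varphi_{-t}^*(p,q))\Big|_{t=0} \\
&= \sum_{i,j} \frac{\partial f}{\partial p_i}\, p_j\, \frac{d}{dt}\left(\frac{\partial \varphi_t^j(q)}{\partial q^i}\right)\Big|_{t=0} + \sum_j \frac{\partial f}{\partial q^j}\, \frac{d\varphi_{-t}^j(q)}{dt}\Big|_{t=0} \\
&= \sum_{i,j} p_j\, \frac{\partial X^j(q)}{\partial q^i}\, \frac{\partial f}{\partial p_i} - \sum_j X^j(q)\, \frac{\partial f}{\partial q^j}.
\end{aligned}$$

From this and (3.109), using the canonical Poisson bracket (3.34) we find

$$\delta_X f = \{J_X,f\}.$$

Finally, verifying (3.110) is a simple exercise. □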

Thus the momentum map is a generalization of (minus) the momentum, whence its name; the quantity in (3.95) is (minus) the angular momentum. These annoying minus signs could be removed by putting a minus sign in (3.86), but that would have other negative (sic) consequences. For example, with our sign choice one often has

$$\{J_A,J_B\} = J_{[A,B]}, \tag{3.111}$$

in which case the accompanying map (3.89) is a homomorphism of Lie algebras, or, equivalently, $J$ is a morphism with respect to the given Poisson bracket on $X$ and the Lie-Poisson bracket on $\mathfrak{g}^*$. Such a momentum map is called infinitesimally equivariant, for if $G$ is connected, (3.111) is equivalent to the equivariance property

$$J(g \cdot x) = g \cdot J(x). \tag{3.112}$$




Here the $G$-action on $\mathfrak{g}^*$ on the right-hand side is the coadjoint representation.

All of this is true for our examples (3.95), (3.96), (3.101), and (3.109); in the latter case we note that the Lie bracket in the Lie algebra of $\mathrm{Diff}(Q)$ is minus the commutator of vector fields. However, (3.111) does not always hold (in which case a fortiori also (3.112) fails). For example, it fails for (3.93): if we take the usual basis $(e,f) = (e_1,e_2,e_3,f_1,f_2,f_3)$ of $\mathfrak{g} = \mathbb{R}^6$ and relabel $e_j = Q_j$ and $f_j = -P_j$, then

$$J_{P_j}(p,q) = p_j; \tag{3.113}$$

$$J_{Q_j}(p,q) = q^j, \tag{3.114}$$

cf. (3.93), and hence, although $[P_i,P_j] = [Q_i,Q_j] = [P_i,Q_j] = 0$, we obtain

$$\{J_{P_i},J_{P_j}\} = \{J_{Q_i},J_{Q_j}\} = 0; \tag{3.115}$$

$$\{J_{P_i},J_{Q_j}\} = \delta_{ij}. \tag{3.116}$$

Fortunately, in cases like that one can often find a central extension of $G$ (see §5.10 below for notation) that acts on $X$ through its quotient group $G$ and does have an infinitesimally equivariant momentum map. In the case at hand, the Heisenberg group $H_3$ does the job, whose central elements $(0,0,c)$ then act trivially on $\mathbb{R}^6$. In terms of the generators (3.68) we take $J_{P_i}$ and $J_{Q_j}$ as in (3.113) - (3.114), and add $J_Z = 1_{\mathbb{R}^6}$; according to (3.69) and (3.115) - (3.116) we then have (3.111), as desired.

Finally, the above formalism leads to a clean formulation of Noether's Theorem, providing the well-known link between symmetries and conserved quantities:

Theorem 3.17. Let $X$ be a Poisson manifold equipped with a Hamiltonian action of some Lie group $G$ (so that there is a momentum map $J : X \to \mathfrak{g}^*$). Suppose $h \in C^\infty(X)$ is $G$-invariant, in that $h(\gamma \cdot x) = h(x)$ for each $\gamma \in G$ and $x \in X$. Then for each $A \in \mathfrak{g}$, the function $J_A$ is constant along the flow of the vector field $X_h$. In other words,

$$J_A(\varphi_t(x)) = J_A(x) \tag{3.117}$$

for any $x \in X$ and any $t \in \mathbb{R}$ for which the flow $\varphi_t(x)$ of $X_h$ is defined.

Proof. Using all assumptions as well as the definition of a flow, we compute:

$$\begin{aligned}
\frac{d}{dt} J_A(\varphi_t(x)) &= X_h(J_A)(\varphi_t(x)) = \delta_h(J_A)(\varphi_t(x)) \\
&= \{h,J_A\}(\varphi_t(x)) = -\{J_A,h\}(\varphi_t(x)) \\
&= -\delta_A(h)(\varphi_t(x)) = -\frac{d}{ds} h(e^{-sA} \cdot \varphi_t(x))\Big|_{s=0} \\
&= -\frac{d}{ds} h(\varphi_t(x))\Big|_{s=0} = 0. \qquad \Box
\end{aligned}$$

For example, a Hamiltonian (3.38) has conserved (angular) momentum if the potential $V$ is translation (rotation) invariant, reflecting (3.93) and (3.95), respectively.
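The rotation-invariant case can be illustrated numerically. The Python sketch below integrates Hamilton's equations on $\mathbb{R}^6$ for a Hamiltonian of the form (3.38) and monitors the momentum map (3.95), $J(\mathbf{p},\mathbf{q}) = \mathbf{p} \times \mathbf{q}$, along the flow. The potential $V(\mathbf{q}) = |\mathbf{q}|^2/2 + |\mathbf{q}|^4/4$, the initial data, the Runge-Kutta scheme and all function names are assumptions chosen for the example; RK4 does not conserve $J$ exactly, so only a small drift is expected.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rhs(p, q, m=1.0):
    """Hamilton's equations for h = |p|^2 / 2m + V(q).

    Assumed rotation-invariant potential: V(q) = |q|^2 / 2 + |q|^4 / 4,
    so that grad V(q) = (1 + |q|^2) q.
    """
    r2 = sum(x * x for x in q)
    dp = tuple(-(1.0 + r2) * x for x in q)   # dp/dt = -grad V
    dq = tuple(x / m for x in p)             # dq/dt = p / m
    return dp, dq


def axpy(u, a, v):
    """Return u + a * v componentwise."""
    return tuple(x + a * y for x, y in zip(u, v))


def rk4_flow(p, q, t, steps=4000):
    """Approximate the Hamiltonian flow phi_t(p, q) by classical RK4."""
    dt = t / steps
    for _ in range(steps):
        k1p, k1q = rhs(p, q)
        k2p, k2q = rhs(axpy(p, 0.5 * dt, k1p), axpy(q, 0.5 * dt, k1q))
        k3p, k3q = rhs(axpy(p, 0.5 * dt, k2p), axpy(q, 0.5 * dt, k2q))
        k4p, k4q = rhs(axpy(p, dt, k3p), axpy(q, dt, k3q))
        p = tuple(a + dt * (b + 2 * c + 2 * d + e) / 6.0
                  for a, b, c, d, e in zip(p, k1p, k2p, k3p, k4p))
        q = tuple(a + dt * (b + 2 * c + 2 * d + e) / 6.0
                  for a, b, c, d, e in zip(q, k1q, k2q, k3q, k4q))
    return p, q


p0, q0 = (0.3, -0.2, 0.5), (1.0, 0.4, -0.7)
J_before = cross(p0, q0)          # momentum map J(p, q) = p x q at time 0
p1, q1 = rk4_flow(p0, q0, 2.0)
J_after = cross(p1, q1)           # the same quantity along the flow
drift = max(abs(a - b) for a, b in zip(J_before, J_after))
```

Replacing the potential by one that is not rotation invariant (for instance by adding a term linear in $q^1$) would be expected to make `drift` grow, since $h$ is then no longer $SO(3)$-invariant.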
Notes

The traditional symplectic approach to classical mechanics, culminating in the momentum map, is exhaustively covered in Guillemin & Sternberg (1984) and Abraham & Marsden (1985). A founding paper for Poisson geometry is Weinstein (1983). The modern Poisson approach to mechanics may be found in Marsden & Ratiu (1994), from which most of the material in this chapter originates.

Our proof of Proposition 3.11 is based on Navarro Gonzalez & Sancho de Salas 
(2003), §2.1. Burtscher (2009) is a nice survey of many similar results. 





Chapter 4 

Quantum physics on a general Hilbert space 


In this chapter we generalize the results of Chapter 2 to infinite-dimensional Hilbert spaces. So let $H$ be a Hilbert space and let $B(H)$ be the set of all bounded operators on $H$. Here a notable point is that linear operators on finite-dimensional Hilbert spaces are automatically bounded, whereas in general they are not. Thus we impose boundedness as an extra requirement, beyond linearity. This is very convenient, because as in the finite-dimensional case, $B(H)$ is a C*-algebra, cf. §C.1. At the same time, assuming boundedness involves no loss of generality whatsoever, since we can always replace closed unbounded operators by bounded ones through the bounded transform, as explained in §B.21. Nonetheless, even the relatively easy setting of bounded operators leads to some technical complications we have to deal with. First, Definition 2.1 must be adjusted as follows:

Definition 4.1. Let $H$ be a Hilbert space.

1. A (quantum) event is a closed linear subspace $L$ of $H$.

2. A density operator is a positive trace-class operator $\rho$ on $H$ such that $\mathrm{Tr}(\rho) = 1$; we continue to denote the set of all density operators on $H$ by $\mathcal{D}(H)$.

3. A (quantum) random variable is a bounded self-adjoint operator on $H$.

4. The spectrum $\sigma(a)$ of a bounded operator $a$ is the set of all $\lambda \in \mathbb{C}$ for which the operator $a - \lambda$ is not invertible in $B(H)$ (cf. Definition B.80).

As shown in Corollary B.88, if $H$ is finite-dimensional this notion of a spectrum reduces to the set of eigenvalues of $a$. Even if $H$ is infinite-dimensional, the spectrum of a self-adjoint operator $a$ is real (i.e., $\sigma(a) \subset \mathbb{R}$); this is also true if $a$ is unbounded (see Theorem B.93). For any $H$, unit vectors $\psi$ still define special density matrices $e_\psi$, as in (2.7); we will later see that these are pure states on $B(H)$, although the set of pure states is no longer exhausted by such density matrices. Finally, quantum events in $H$ still bijectively correspond with projections on $H$; see Proposition B.76. The Born rule as well as the correspondence between density matrices and states require a separate discussion, to which we now turn.

4.1 The Born rule from Bohrification (ii)

In this section we extend the characterization of the Born rule in §2.5, which was restricted to finite phase spaces $X$ and finite-dimensional Hilbert spaces $H$, to the general case. Recall that a probability space is a measure space $(X,\Sigma,\mu)$ for which $\mu(X) = 1$, and that, for compact $X$, a state on $C(X)$ is a linear map $\varphi : C(X) \to \mathbb{C}$ that is positive and satisfies $\varphi(1_X) = 1$. Theorem B.15 and Corollary B.17 yield:

Theorem 4.2. Let $X$ be a compact Hausdorff space. There is a bijective correspondence between probability measures $\mu$ on $X$ and states $\omega$ on $C(X)$, given by

$$\omega(f) = \int_X d\mu\, f, \quad f \in C(X). \tag{4.1}$$

More precisely, the correspondence in question is between complete regular probability spaces $(X,\Sigma,\mu)$ and states on $C(X)$, and this is understood in what follows.

Second, we recall that if $H$ is a Hilbert space and $a \in B(H)$, then $C^*(a)$ is the C*-algebra generated by $a$ and $1_H$ (i.e., the norm-closure of the algebra of all polynomials in $a$). Theorems B.84, B.94, and B.93 give the following spectral theorem:

Theorem 4.3. If $a^* = a \in B(H)$, then $C^*(a)$ is commutative, $\sigma(a) \subset \mathbb{R}$ is compact, and there is an isomorphism of (commutative) C*-algebras

$$C(\sigma(a)) \cong C^*(a), \tag{4.2}$$

written $f \mapsto f(a)$, which is unique if it is subject to the following conditions:

1. the unit function $1_{\sigma(a)} : \lambda \mapsto 1$ corresponds to the unit operator $1_H$;

2. the identity function $\mathrm{id}_{\sigma(a)} : \lambda \mapsto \lambda$ is mapped to the given operator $a$.

Furthermore, this continuous functional calculus satisfies the rules

$$(tf + g)(a) = t f(a) + g(a); \tag{4.3}$$

$$(fg)(a) = f(a)g(a); \tag{4.4}$$

$$f(a)^* = f^*(a). \tag{4.5}$$
Combining Theorems 4.2 and 4.3 gives a result of great importance: 

Corollary 4.4. Let H be a Hilbert space, let a* = a G B(H), and let xp G H be a unit 
vector. There exists a unique probability measure Px^/ on the spectrum (7(a) such that 

{\j/,f{a)\i/)= [ dp^f, f GC{a{a)). (4.6) 

Ja(a) 

In terms of the spectral projections e^ = 14(a) (defined for Borel sets A C (7(a)) 
constructed in (B.305) - (B.307) and Theorem B.102, the Born measure is given by 

Pxf>(A) = \\eAwf- (4.7) 
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More generally, a density operator p G &{H) induces a unique probability measure 
Pp on <7 (a) for which 

Tr(p/(a))=/ dppf, f€Cia{a)),-, (4.8) 

J (y{a) 

Pp{A)=Tv{peA). (4.9) 

This measure on (j{a) is called the Born measure (defined by a and \j/ or p). 

Proof The point is that the map /(v/,/(a)i//) defines a state on C{<j{a))\ 

• Linearity follows from linearity of the continuous functional calculus / i-G /(a); 

• Positivity follows because if f > 0, then f = Jf ■ -v/f, so that by (4.4) and (4.5), 
(V/,/(a)v/) = ||v(/(«)rf > 0 ; 

• Unitality follows from Theorem4.3.1, i.e., (y/, la(a)(^)¥} = (¥j ^Jt¥} = 1- 

To prove (4.7), use Lemma B.97 to approximate I 4 by functions f„ G C(o'(a)) as 
stated. By Theorem B.13.2 (i.e., the Lebesgue Monotone Convergence Theorem), 
we have dpx^fn dpx,, I 4 = Pxf,{A), whereas by (B.315) with a„ = f„{a), 

one has (v/,/„(a)vf) —>■ {¥^^a¥) = Ik/iV^lP- Hence (4.7) follows from (4.6). 

The proof for density operators is analogous. □ 

Defining the mean value (a) ^ of a with respect to the Born measure by 

(a)y,= / dpxf,{x)x, (4.10) 

J (:y{a) 

and similarly for p, using Theorem 4.3.2 we easily obtain 

(a)y, = (v/,av/); (4.11) 

(a)p=Tr(pfl). (4.12) 

As an important special case, suppose that (7(a) = (7p(a) (i.e., each X G (7(a) is 
an eigenvalue); this always happens if H is finite-dimensional. Eq. (A.57) then gives 

{¥,f{a)¥)= E /(^)-llarf, 

?Le(j{a) 

where ei is the projection onto the eigenspace = {¥ € H \ a\j/ = X\j/}. Thus 

Mv(^) = llarf, (4.13) 

and using the notation P^ia = X) for p^,{X), eq. (4.11) just becomes 

(a)^= Y. = (4.14) 

Xea{a) 
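In finite dimension the Born measure (4.13) and the mean value (4.14) are explicit sums, which the following Python sketch evaluates for a $2 \times 2$ example. The matrix $a$, the unit vector $\psi$ and the helper names are hypothetical choices; the eigenpairs are entered by hand rather than computed, and real vectors are used so that no complex conjugation is needed.

```python
import math

# A self-adjoint operator on a two-dimensional Hilbert space and a unit vector psi.
a = [[2.0, 1.0], [1.0, 2.0]]
psi = [0.6, 0.8]

# Eigenpairs of a, found by hand: a v = lambda v with unit eigenvectors v.
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]


def apply(mat, vec):
    """Matrix-vector product."""
    return [sum(mat[i][j] * vec[j] for j in range(len(vec))) for i in range(len(mat))]


def inner(u, v):
    """Inner product of real vectors."""
    return sum(x * y for x, y in zip(u, v))


# Born measure: mu_psi({lambda}) = ||e_lambda psi||^2, which equals <v, psi>^2
# when the eigenspace is spanned by a single unit vector v.
born = {lam: inner(v, psi) ** 2 for lam, v in eigenpairs}


def moment(n):
    """<psi, a^n psi>, computed by applying a to psi n times."""
    vec = psi
    for _ in range(n):
        vec = apply(a, vec)
    return inner(psi, vec)


mean_from_measure = sum(lam * w for lam, w in born.items())  # sum of lambda * P(a = lambda)
mean_from_state = moment(1)                                  # <psi, a psi>
```

The same dictionary `born` reproduces every expectation $\langle \psi, a^n \psi\rangle$ as $\sum_\lambda \lambda^n \mu_\psi(\{\lambda\})$, which is the special case $f(x) = x^n$ of (4.6).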

It is customary to extend the Born measure on $\sigma(a) \subset \mathbb{R}$ to a (probability) measure on all of $\mathbb{R}$ by simply stipulating that




(4.15) 

we will often assume this and omit the prime. This obviously implies that ) = 0 
for any Borel set 4 C K disjoint from (7(fl); in particular, if a{a) is discrete, then 
Hif, is concentrated on the eigenvalues X of a, in that 

^tfi{A) = ^ |J,^/{X). (4.16) 

XeAnaia) 

To state an interesting property of the Born measure we need Hausdorff’s solution 
to the relevant special case of the famous Hamburger Moment Problem: 

Theorem 4.5. If $K \subset \mathbb{R}$ is compact, then any finite measure $\mu$ on $K$ is determined 
by its moments 

$\alpha_n = \int_K d\mu(x)\, x^n$. (4.17) 

Using $f(x) = x^n$ in (4.6), we therefore obtain: 

Corollary 4.6. The Born measure $\mu_\psi$ is determined by its moments 

$\alpha_n = \langle\psi, a^n\psi\rangle$. (4.18) 

More precisely, we need to be sure that numbers $(\alpha_n)$ of the kind (4.18) are the 
moments of some (probability) measure. This follows from the spectral theorem by 
running the above argument backwards, but one may also use the general solution 
of the Hamburger Moment Problem, which we here state without proof: 

Theorem 4.7. A sequence of real numbers $(\alpha_n)$ forms the moments of some measure 
$\mu$ on $\mathbb{R}$ iff for all $N \in \mathbb{N}$ and $(\beta_1, \ldots, \beta_N) \in \mathbb{C}^N$ one has $\sum_{n,m} \overline{\beta_n}\beta_m\alpha_{n+m} \geq 0$. 

Furthermore, if there are constants $C$ and $D$ such that $|\alpha_n| \leq C D^n n!$, then $\mu$ is 
uniquely determined by its moments $(\alpha_n)$. 

These conditions are easily checked from (4.18). 
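For a concrete bounded operator both conditions can be tested numerically. The Python sketch below (NumPy assumed; the matrix `a`, the vector `psi`, and the helper names `moments` and `hankel` are hypothetical choices for this illustration) computes the moments (4.18), checks that the Hankel matrix with entries $\alpha_{n+m}$ is positive semi-definite up to rounding, and verifies the growth bound with $C = 1$ and $D = \|a\|$, which is stronger than the factorial bound of Theorem 4.7.

```python
import numpy as np

def moments(a, psi, count):
    """Return alpha_n = <psi, a^n psi> for n = 0, ..., count - 1."""
    out = []
    vec = psi.copy()
    for _ in range(count):
        out.append(float(np.real(np.vdot(psi, vec))))
        vec = a @ vec
    return np.array(out)

def hankel(alpha, size):
    """Hankel matrix with entries alpha_{n+m}, 0 <= n, m < size."""
    return np.array([[alpha[n + m] for m in range(size)] for n in range(size)])

# Hypothetical sample data: a real symmetric 2x2 matrix and a unit vector.
a = np.array([[2.0, 1.0], [1.0, 0.0]])
psi = np.array([1.0, 0.0])

size = 4
alpha = moments(a, psi, 2 * size - 1)            # alpha_0, ..., alpha_6
min_eig = np.linalg.eigvalsh(hankel(alpha, size)).min()
op_norm = np.linalg.norm(a, 2)                    # operator norm ||a||
growth_ok = all(abs(alpha[n]) <= op_norm ** n + 1e-9 for n in range(len(alpha)))
```

Since the Born measure of a $2 \times 2$ matrix has at most two atoms, the Hankel matrix here has rank at most two, so its smallest eigenvalue is zero up to rounding error rather than strictly positive.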

If $a$ is unbounded, but still assumed to be self-adjoint (in the sense appropriate 
for unbounded operators, cf. Definition B.70), the spectrum $\sigma(a)$ remains real (see 
Theorem B.93) but it is no longer compact. Nonetheless, the Born measure on $\sigma(a)$ 
may be constructed in almost exactly the same way as in the bounded case, this time 
invoking Corollary B.21 and Theorem B.158 instead of Theorems 4.2 and B.94, 
respectively. Corollary 4.4 then holds almost verbatim for the unbounded case: 

Corollary 4.8. Let $H$ be a Hilbert space, let $a^* = a$, and let $\psi \in H$ be a unit vector. 
There exists a unique probability measure $\mu_\psi$ on the spectrum $\sigma(a)$ such that 

$\langle\psi, f(a)\psi\rangle = \int_{\sigma(a)} d\mu_\psi\, f, \quad f \in C_0(\sigma(a))$. (4.19) 

Also, eqs. (4.7) and (4.9) hold, as does (4.8), with $f \in C_0(\sigma(a))$. 

There is no need to worry about domains, since even if $a$ is unbounded, $f(a)$ is 
bounded for $f \in C_b(\sigma(a))$, and hence also for $f \in C_0(\sigma(a))$. 

The physical relevance of the Born measure is given by the Born rule: 

If an observable $a$ is measured in a state $\rho$, then the probability $P_\rho(a \in \Delta)$ that the 
outcome lies in $\Delta \subset \mathbb{R}$ is given by the Born measure $\mu_\rho$ defined by $a$ and $\rho$, i.e., 

$P_\rho(a \in \Delta) = \mu_\rho(\Delta)$. (4.20) 

As in the finite-dimensional case, the Born measure may be generalized to families 
$(a_1, \ldots, a_n)$ of commuting self-adjoint operators. Assuming these are bounded, 
the C*-algebra $C^*(a_1, \ldots, a_n)$ is defined in the obvious way, i.e., as the smallest 
C*-algebra containing each $a_i$, or, equivalently, as the norm-closure of the algebra of all 
finite polynomials in the $(a_1, \ldots, a_n)$. This C*-algebra is commutative, as a simple 
approximation argument shows: polynomials in the $a_i$ obviously commute, and this 
property extends to the closure by continuity of multiplication. However, even in the 
bounded case, the correct notion of a joint spectrum is not obvious. In order to motivate 
the following definition, it helps to recall Definition 1.4, Theorem C.24, and 
especially the last sentence before the proof of the latter, making the point that the 
spectrum $\sigma(a)$ of a single (bounded) self-adjoint operator coincides with the image 
of the Gelfand spectrum $\Sigma(C^*(a))$ in $\mathbb{C}$ under the map $\omega \mapsto \omega(a)$. 

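Definition 4.9. The joint spectrum $\sigma(\mathbf{a}) = \sigma(a_1, \ldots, a_n) \subset \mathbb{R}^n$ of a finite family 
$\mathbf{a} = (a_1, \ldots, a_n)$ of commuting bounded self-adjoint operators is the image of the 
Gelfand spectrum $\Sigma(C^*(a_1, \ldots, a_n)) \equiv \Sigma(C^*(\mathbf{a}))$ under the map 

$\Sigma(C^*(a_1, \ldots, a_n)) \ni \omega \mapsto (\omega(a_1), \ldots, \omega(a_n))$. (4.21) 

Since $\omega(a_i)$ only utilizes the restriction of $\omega$ to $C^*(a_i) \subset C^*(\mathbf{a})$, we have $\omega(a_i) \in 
\sigma(a_i) \subset \mathbb{R}$, so that the joint spectrum $\sigma(\mathbf{a})$, i.e., the image of $\Sigma(C^*(\mathbf{a}))$, is a compact 
subset of $\sigma(a_1) \times \cdots \times \sigma(a_n) \subset \mathbb{R}^n$. 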

To justify this definition, we note that: 

• For $n = 1$, this definition reproduces the usual spectrum, cf. Theorem C.24. 

• For $n > 1$ and $\dim(H) < \infty$, we recover the joint spectrum of Definition A.16. 

• For $n > 1$ and $\dim(H) = \infty$, Weyl’s Theorem B.91 generalizes in the obvious 
way: we have $\lambda \in \sigma(\mathbf{a})$ iff there exists a sequence $(\psi_k)$ of unit vectors in $H$ with 

$\lim_{k\to\infty} \|(a_i - \lambda_i)\psi_k\| = 0$, (4.22) 

for each $i = 1, \ldots, n$. The proof is similar. 

One way to see the second claim is to use Proposition C.14 joined with the 
observation that, as in the case of $A = B(H)$ for finite-dimensional $H$, any pure state 
on a finite-dimensional C*-algebra $A \subset B(H)$ is a vector state (2.42), too. To see 
this, we first specialize Theorem C.133 to the finite-dimensional case (where the 
proof becomes elementary), so that each state on $C^*(\mathbf{a})$ takes the form (2.33). 
Subsequently, we use the spectral decomposition (2.6), and use the definition of purity: 
suppose $\omega(b) = \mathrm{Tr}(\rho b) = \sum_i p_i\langle\upsilon_i, b\upsilon_i\rangle = \sum_i p_i\,\omega_{\upsilon_i}(b)$ is pure, where $b \in C^*(\mathbf{a})$. 
Then $\omega_{\upsilon_i} = \omega$ for each $i$ with $p_i > 0$, so that $\omega$ is a vector state, say $\omega(b) = \langle\psi, b\psi\rangle$ where $\psi$ is 
one of the $\upsilon_i$. Once we know this, suppose $\lambda = (\lambda_1, \ldots, \lambda_n) \in \sigma(\mathbf{a})$, with $\lambda_i = \omega(a_i)$. 
Multiplicativity of $\omega$ implies that for any finite polynomial $p$ in $n$ real variables we 
have $\langle\psi, p(\mathbf{a})\psi\rangle = p(\lambda)$, which easily gives $a_i\psi = \lambda_i\psi$ for each $i$; for example, 
take $p(x) = (x_i - \lambda_i)^2$, so that the previous equation gives $\|(a_i - \lambda_i)\psi\|^2 = 0$. 

Conversely, if $\lambda$ is a joint eigenvalue of $\mathbf{a}$, then by definition there exists a joint 
eigenvector $\psi$ whose vector state $\omega(b) = \langle\psi, b\psi\rangle$ on $C^*(\mathbf{a})$ is multiplicative. 

Using this (perhaps contrived) notion of a joint spectrum, Theorem 2.19 now 
holds by construction also if $\dim(H) = \infty$, where the pertinent isomorphism $f \mapsto 
f(\mathbf{a})$ is given as in the single operator case, that is, by starting with polynomials and 
using a continuity argument to pass to arbitrary continuous functions. 

Theorem 2.18 and Corollary 4.4 then generalize to: 

Theorem 4.10. Let $H$ be a Hilbert space, let $\mathbf{a} = (a_1, \ldots, a_n)$ be a finite family of 
commuting bounded self-adjoint operators, and let $\psi \in H$ be a unit vector. There 
exists a unique probability measure $\mu_\psi$ on the joint spectrum $\sigma(\mathbf{a})$ such that 

$\langle\psi, f(\mathbf{a})\psi\rangle = \int_{\sigma(\mathbf{a})} d\mu_\psi\, f, \quad f \in C(\sigma(\mathbf{a}))$, (4.23) 

or, equivalently, for special Borel sets $\Delta = \Delta_1 \times \cdots \times \Delta_n \subset \sigma(\mathbf{a})$, where $\Delta_i \subset \sigma(a_i)$, 

$\mu_\psi(\Delta) = \|e_{\Delta_1} \cdots e_{\Delta_n}\psi\|^2$, (4.24) 

where the $e_{\Delta_i} = 1_{\Delta_i}(a_i)$ are the pertinent spectral projections (which commute). 

Similarly for density operators instead of pure states. 
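The product formula (4.24) is easy to try out on matrices. The Python sketch below is a minimal illustration under assumptions: it uses NumPy, the helper names `spectral_projection`, `joint_born` and `near` are hypothetical, and the two commuting observables are the sample pair $\sigma_z \otimes 1$ and $1 \otimes \sigma_z$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$, evaluated in an entangled unit vector.

```python
import numpy as np

def spectral_projection(a, in_delta):
    """e_Delta = 1_Delta(a) for a Hermitian matrix a; in_delta tests an eigenvalue for membership."""
    eigvals, eigvecs = np.linalg.eigh(a)
    mask = np.array([bool(in_delta(lam)) for lam in eigvals])
    cols = eigvecs[:, mask]
    return cols @ cols.conj().T

def joint_born(ops, deltas, psi):
    """mu_psi(Delta_1 x ... x Delta_n) = ||e_{Delta_1} ... e_{Delta_n} psi||^2, cf. (4.24)."""
    vec = psi
    for a, in_delta in zip(reversed(ops), reversed(deltas)):
        vec = spectral_projection(a, in_delta) @ vec
    return float(np.linalg.norm(vec) ** 2)

def near(value, tol=1e-9):
    """Indicator of the one-point set {value}, with a numerical tolerance."""
    return lambda lam: abs(lam - value) < tol

sz = np.diag([1.0, -1.0])
one = np.eye(2)
a1 = np.kron(sz, one)   # sigma_z on the first factor
a2 = np.kron(one, sz)   # sigma_z on the second factor
commute = np.allclose(a1 @ a2, a2 @ a1)

psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)

table = {(s, t): joint_born([a1, a2], [near(s), near(t)], psi)
         for s in (1.0, -1.0) for t in (1.0, -1.0)}
```

In this example the joint Born measure puts weight 1/2 on each of the points $(1,1)$ and $(-1,-1)$ of the joint spectrum and weight zero on the other two.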

If (some of) the operators $a_i$ are unbounded, we use the trick of §B.21 and pass 
to their bounded transforms $b_i$, see Theorem B.152. We say that the $a_i$ commute iff 
the corresponding bounded operators $b_i$ do; this is equivalent to commutativity of 
all spectral projections of the $a_i$. We then define, in self-explanatory notation, 

$\sigma(\mathbf{a}) = \{\lambda(1 - \lambda^2)^{-1/2} \mid \lambda \in \sigma(\mathbf{b}) \cap (-1,1)^n\}$. (4.25) 

This leads to Born measures on $\sigma(\mathbf{a})$ defined either as in (4.23), with $f \in C(\sigma(\mathbf{a}))$ 
replaced by $f \in C_0(\sigma(\mathbf{a}))$, cf. (4.19), or as in (4.24). 
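At the level of spectral values, the bounded transform and the map used in (4.25) are the scalar functions $x \mapsto x(1+x^2)^{-1/2}$ and $\lambda \mapsto \lambda(1-\lambda^2)^{-1/2}$, which are mutually inverse bijections between $\mathbb{R}$ and $(-1,1)$. A small Python check of this, assuming NumPy and with hypothetical function names, reads:

```python
import numpy as np

def bounded_transform(x):
    """Scalar version of the bounded transform: maps the real line into (-1, 1)."""
    return x / np.sqrt(1.0 + x ** 2)

def inverse_transform(lam):
    """Scalar version of the map in (4.25): maps (-1, 1) back onto the real line."""
    return lam / np.sqrt(1.0 - lam ** 2)

xs = np.linspace(-50.0, 50.0, 11)   # sample "unbounded" spectral values
lams = bounded_transform(xs)
recovered = inverse_transform(lams)
```

The endpoints $\pm 1$ are never reached, which is why (4.25) intersects $\sigma(\mathbf{b})$ with the open cube $(-1,1)^n$.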

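For example, if $H = L^2(\mathbb{R}^n)$ and $a_i\psi(x) = x_i\psi(x)$, defined on the domain 

$D(a_i) = \{\psi \in L^2(\mathbb{R}^n) \mid \int_{\mathbb{R}^n} d^n x\, x_i^2|\psi(x)|^2 < \infty\}$, (4.26) 

as in (B.242), then $b_i\psi(x) = x_i(1 + x_i^2)^{-1/2}\psi(x)$, so that $\sigma(\mathbf{b}) = [-1,1]^n$ and hence 
$\sigma(\mathbf{a}) = \mathbb{R}^n$. For a measurable region $\Delta \subset \mathbb{R}^n$ we then have Pauli’s famous formula 

$\mu_\psi(\Delta) = \int_\Delta d^n x\, |\psi(x)|^2$ (4.27) 

for the probability of finding the particle in the region $\Delta$, given that the system is in a pure state $\psi$. 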
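As a one-dimensional illustration of (4.27), the following Python sketch integrates $|\psi|^2$ numerically. It is a sketch under assumptions only: NumPy is used, the Gaussian wave function `psi` and the helper `probability` are hypothetical choices for this example, and the integral is approximated by a hand-written trapezoidal rule.

```python
import math
import numpy as np

def psi(x):
    """Normalized Gaussian wave function on the real line (hypothetical sample state)."""
    return np.pi ** -0.25 * np.exp(-x ** 2 / 2.0)

def probability(lower, upper, points=20001):
    """Trapezoidal approximation of the integral of |psi|^2 over [lower, upper]."""
    x = np.linspace(lower, upper, points)
    density = np.abs(psi(x)) ** 2
    return float(np.sum((density[1:] + density[:-1]) * np.diff(x)) / 2.0)

p_inside = probability(-1.0, 1.0)    # probability of finding the particle in [-1, 1]
p_total = probability(-10.0, 10.0)   # essentially the total probability
```

For this particular state the exact value of the first integral is the error function $\mathrm{erf}(1)$, which gives an independent check on the numerics.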
4.2 Density operators and normal states 

Definition 2.4 of a state still makes good sense in the infinite-dimensional case, as 
it simply specializes the general definition of a state on a C*-algebra $A$ to the case 
$A = B(H)$. Thus we continue to say that a state on $B(H)$ is a complex-linear map 
$\omega : B(H) \to \mathbb{C}$ satisfying $\omega(b^*b) \geq 0$ for each $b \in B(H)$ and $\omega(1_H) = 1$. Despite 
this lack of novelty in the definition of a state (i.e., compared to finite-dimensional 
Hilbert spaces), Theorem 2.7 no longer holds if $H$ is infinite-dimensional: although 
it (almost trivially) remains true that density operators $\rho$ on $H$ define states on $B(H)$ 
through the fundamental correspondence $\omega(a) = \mathrm{Tr}(\rho a)$, $a \in B(H)$, cf. (2.33), there 
are (many) states that are not given in that way (see below). Fortunately, states that 
do arise through (2.33) can be characterized in a simple way. 

Definition 4.11. A state $\omega : B(H) \to \mathbb{C}$ is called normal if for each orthogonal 
family $(e_i)$ of projections (i.e., $e_i^* = e_i$ and $e_i e_j = \delta_{ij} e_i$) one has 

$\omega\left(\sum_i e_i\right) = \sum_i \omega(e_i)$. (4.28) 

Here $\sum_i e_i$ is defined as the projection on the smallest closed subspace $K$ of $H$ that 
contains each $e_i H$ (that is, $\sum_i e_i = \bigvee_i e_i$, i.e., the supremum in the poset $\mathcal{P}(H)$ of all 
projections on $H$ with respect to the partial order $e \leq f$ iff $eH \subseteq fH$). Furthermore, 
the sum over $i$ on the right-hand side is defined by (B.11), i.e., as the supremum (in 
$\mathbb{R}$) of the set of all sums $\sum_{i\in F} \omega(e_i)$ over finite subsets $F \subset I$ of the index set $I$ in 
which $i$ takes values. It is finite because $\sum_{i\in F} e_i \leq 1_H$ and hence, since $\omega$ is positive, 

$\sum_{i\in F} \omega(e_i) \leq \omega(1_H) = 1$. 

For example, let $(\upsilon_i)$ be a basis of $H$ with associated one-dimensional projections 

$e_i = |\upsilon_i\rangle\langle\upsilon_i|$. (4.29) 

If $\omega$ is assumed to be a state, then the additivity condition (4.28) implies 

$\sum_i \omega(e_i) = 1$, (4.30) 

or, equivalently, using Definition B.6 etc. as well as the notation $e_F = \sum_{i\in F} e_i$, 

$\lim_F \omega(e_F) = 1$. (4.31) 

If $H$ is separable, any orthogonal family $(e_i)$ of projections is necessarily countable, 
and (4.28) is analogous to the countable additivity condition defining a measure. 

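Theorem 4.12. A state $\omega$ on $B(H)$ takes the form $\omega(a) = \mathrm{Tr}(\rho a)$ for some (unique) 
density operator $\rho \in \mathcal{D}(H)$ iff it is normal. 

Proof. First, eq. (2.33) implies (4.28). To see this, take the trace with respect to 
some basis $(\upsilon_j)$ of $H$ that is adapted to the family $(e_i)$ in the sense that for each $j$, 
either $e_i\upsilon_j = \upsilon_j$ (i.e., $\upsilon_j \in e_i H$) for one value of $i$, or $e_i\upsilon_j = 0$ for all $i$. Then 

$\omega\left(\sum_i e_i\right) = \mathrm{Tr}\left(\rho \sum_i e_i\right) = \sum'_j \langle\upsilon_j, \rho\upsilon_j\rangle$, 

where the sum $\sum'_j$ is over those $j$ for which $\upsilon_j \in K = \bigvee_i e_i H$. On the other hand, 
since the basis is adapted, we have $\upsilon_j \in K$ iff there is an $i$ for which $e_i\upsilon_j = \upsilon_j$ (since 
otherwise $e_i\upsilon_j = 0$ and hence $\upsilon_j \perp e_i H$ for each $i$, so that $\upsilon_j \in K^\perp$), so 

$\sum_i \omega(e_i) = \sum_i \mathrm{Tr}(\rho e_i) = \sum_i \sum_j \langle\upsilon_j, \rho e_i\upsilon_j\rangle = \sup_F \sum_{j\in J_F} \langle\upsilon_j, \rho\upsilon_j\rangle = \sum'_j \langle\upsilon_j, \rho\upsilon_j\rangle$, 

where $J_F$ consists of those $j$ for which $\upsilon_j \in \sum_{i\in F} e_i H$. This gives (4.28). 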

Conversely, assume $\omega$ is normal. For the $e_i$ in (4.28) we now take the projections 
(4.29) determined by some basis $(\upsilon_i)$. For each $a \in B(H)$ we then have 

$\omega(a) = \lim_F \omega(e_F a)$. (4.32) 

Indeed, using Cauchy-Schwarz for the positive semi-definite form $\langle a, b\rangle = \omega(a^*b)$, 
as in (C.197), and using $\sum_i e_i = 1_H$, hence $\omega(a) = \omega(\sum_i e_i a)$, we have 

$|\omega(a) - \omega(e_F a)|^2 = |\omega(e_{F^c} a)|^2 \leq \omega(a^*a)\,\omega(e_{F^c}) \leq \|a\|^2\omega(e_{F^c})$, (4.33) 

since $e_{F^c} = \sum_{i\notin F} e_i$ is a projection. Since $\omega(e_F) + \omega(e_{F^c}) = \omega(1_H) = 1$, eq. (4.31) 
gives $\lim_F \omega(e_{F^c}) = 0$, so that (4.33) gives (4.32). For each finite $F \subset I$, the operator 
$e_F a$ has finite rank and hence is compact. According to Theorem B.146, the 
restriction of $\omega : B(H) \to \mathbb{C}$ to the C*-algebra $B_0(H)$ of compact operators on $H$ is 
induced by a trace-class operator $\rho$, which (from the requirement that $\omega$ be a state) 
must be a density operator. Hence $\omega(e_F a) = \mathrm{Tr}(\rho e_F a)$, and we finally have 

$\omega(a) = \lim_F \omega(e_F a) = \lim_F \mathrm{Tr}(\rho e_F a) = \mathrm{Tr}(\rho a)$. (4.34) 

To derive the final equality, we rewrite $\mathrm{Tr}(\rho e_F a) = \mathrm{Tr}(e_F a\rho)$, cf. (A.78) and 
Proposition B.144, note that $a\rho \in B_1(H)$, as shown in Corollary B.147, and observe 
that for any $b \in B_1(H)$ we have $\lim_F \mathrm{Tr}(e_F b) = \mathrm{Tr}(b)$. To see this, simply compute 
the trace in the basis $(\upsilon_i)$ defining the projections $e_i$ through (4.29), so that 
$\mathrm{Tr}(e_F b) = \sum_{i\in F} \langle\upsilon_i, b\upsilon_i\rangle$, and note that by Definition B.6, 

$\lim_F \sum_{i\in F} \langle\upsilon_i, b\upsilon_i\rangle = \sum_i \langle\upsilon_i, b\upsilon_i\rangle = \mathrm{Tr}(b)$. 

Finally, suppose $\omega(a) = \mathrm{Tr}(\rho_1 a) = \mathrm{Tr}(\rho_2 a)$ for each $a \in B(H)$ and hence for 
each $a \in B_0(H)$. It follows from (B.476) that $\mathrm{Tr}(\rho a) = 0$ for all $a \in B_0(H)$ iff $\rho = 0$. 
Hence $\rho_1 = \rho_2$, i.e., a normal state $\omega$ uniquely determines a density operator $\rho$. □ 
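The estimate (4.33) behind the limit (4.32) can be watched at work in a finite-dimensional stand-in. The Python sketch below is an illustration under assumptions: NumPy is used, the dimension 30, the random density matrix `rho` and the random operator `a` are hypothetical sample data, and the finite sets $F$ are taken to be the first $k$ basis vectors of the standard basis.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 30

# Hypothetical sample data: a random density matrix and a random operator.
m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho = m @ m.conj().T
rho /= np.trace(rho).real

a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
a_norm = np.linalg.norm(a, 2)   # operator norm ||a||

def omega(b):
    """The state omega(b) = Tr(rho b)."""
    return np.trace(rho @ b)

gaps, bounds = [], []
for k in range(dim + 1):
    e_f = np.diag([1.0] * k + [0.0] * (dim - k))   # e_F for F = first k basis vectors
    e_fc = np.eye(dim) - e_f                       # complementary projection
    gaps.append(abs(omega(a) - omega(e_f @ a)) ** 2)   # left-hand side of (4.33)
    bounds.append(a_norm ** 2 * omega(e_fc).real)      # right-hand side of (4.33)
```

Each entry of `gaps` is dominated by the corresponding entry of `bounds`, and both vanish once $F$ exhausts the basis; in infinite dimension the same bound drives the limit (4.32).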
If $\omega$ is normal, we may therefore use the spectral resolution (2.6) of the corresponding 
density operator $\rho$, i.e., $\rho = \sum_i p_i|\upsilon_i\rangle\langle\upsilon_i|$, where $(\upsilon_i)$ is some basis of $H$ 
consisting of eigenvectors of $\rho$ (which exists because $\rho$ is compact and self-adjoint), 
and the corresponding eigenvalues satisfy $p_i \geq 0$ and $\sum_i p_i = 1$; see the explanation 
after Definition B.148. Computing the trace in the same basis gives 

$\mathrm{Tr}(\rho a) = \sum_i p_i\langle\upsilon_i, a\upsilon_i\rangle$. (4.35) 

We may characterize normality in a number of other ways. First note that because 
of the duality $B_1(H)^* \cong B(H)$ of Theorem B.146, cf. (B.477), we may equip $B(H)$ 
with the w*-topology in its role as the dual of the trace-class operators $B_1(H)$, see 
§B.9; this means that $a_\lambda \to a$ iff $\mathrm{Tr}(\rho a_\lambda) \to \mathrm{Tr}(\rho a)$ for each $\rho \in B_1(H)$, or, equivalently, 
for each $\rho \in \mathcal{D}(H)$, since each trace-class operator is a linear combination of 
at most four density operators, as follows from Lemma C.53 with (C.8) - (C.9). The 
w*-topology on $B(H)$, seen as the dual of $B_1(H)$, is called the $\sigma$-weak topology. By 
Proposition B.46, the $\sigma$-weakly continuous linear functionals $\varphi$ on $B(H)$ are just 
those given by $\varphi(a) = \mathrm{Tr}(ba)$ for some trace-class operator $b \in B_1(H)$. 

Secondly, $B(H)$ is monotone complete, in the sense that each net $(a_\lambda)$ of positive 
operators that is bounded (i.e., $0 \leq a_\lambda \leq c \cdot 1_H$ for some $c > 0$ and all $\lambda \in \Lambda$) and 
increasing (in that $a_\lambda \leq a_{\lambda'}$ whenever $\lambda \leq \lambda'$) has a supremum $a$ with respect to the 
standard ordering $\leq$ on $B(H)_+$, which supremum coincides with the strong limit of 
the net (i.e., $\lim_\lambda a_\lambda\psi = a\psi$ for each $\psi \in H$); the proof is the same as for Proposition 
B.98, and also here we write $a_\lambda \nearrow a$ to describe this entire situation. 

Corollary 4.13. The following conditions on a state $\omega \in S(B(H))$ are equivalent: 

1. $\omega$ is normal, cf. Definition 4.11; 

2. $\omega(a) = \lim_\lambda \omega(a_\lambda)$ if $a_\lambda \nearrow a$; 

3. $\omega(a) = \mathrm{Tr}(\rho a)$ for some density operator $\rho \in \mathcal{D}(H)$; 

4. $\omega$ is $\sigma$-weakly continuous. 

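Proof. We have seen 1 ⇔ 3 ⇔ 4, and 2 ⇒ 1 is obvious, so establishing 3 ⇒ 2 would 
complete the proof. To this effect, we first note that because the sum (4.35) is convergent, 
for $\varepsilon > 0$ we may find a finite subset $F \subset I$ for which $\sum_{i\notin F} p_i < \varepsilon/(3\|a\|)$ 
(assuming $a \neq 0$). Since $0 \leq a_\lambda \leq a$ also implies $a_\lambda \leq \|a\| \cdot 1_H$ (since $a \leq \|a\| \cdot 1_H$), 
we therefore have $|\sum_{i\notin F} p_i\langle\upsilon_i, (a_\lambda - a)\upsilon_i\rangle| < 2\varepsilon/3$, uniformly in $\lambda$. Moreover, since 
$F$ is finite and $a_\lambda \to a$ strongly, we can find $\lambda_0$ such that for all $\lambda \geq \lambda_0$ we have 

$|\sum_{i\in F} p_i\langle\upsilon_i, (a_\lambda - a)\upsilon_i\rangle| < \varepsilon/3$. (4.36) 

Consequently, for such $\lambda$, 

$|\mathrm{Tr}(\rho(a_\lambda - a))| \leq |\sum_{i\in F} p_i\langle\upsilon_i, (a_\lambda - a)\upsilon_i\rangle| + |\sum_{i\notin F} p_i\langle\upsilon_i, (a_\lambda - a)\upsilon_i\rangle| < \tfrac{1}{3}\varepsilon + \tfrac{2}{3}\varepsilon = \varepsilon$. 

This shows that $\lim_\lambda |\mathrm{Tr}(\rho(a_\lambda - a))| = 0$, so that assumption 3 implies no. 2. □ 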
We denote the normal state space of $B(H)$, i.e., the set of all normal states on 
$B(H)$, by $S_n(B(H))$. It is easy to see from Definition B.148 that $S_n(B(H))$ is a convex 
(but not necessarily compact!) subset of the total state space $S(B(H))$. 

Corollary 4.14. The relation $\omega(a) = \mathrm{Tr}(\rho a)$ induces an isomorphism 

$S_n(B(H)) \cong \mathcal{D}(H)$ (4.37) 

of convex sets (i.e., $\omega \leftrightarrow \rho$). Furthermore, for the corresponding pure states we have 

$P_n(B(H)) \cong \mathcal{P}_1(H)$, (4.38) 

i.e., any pure state $\omega$ on $B_0(H)$, as well as any normal pure state on $B(H)$, is given 
by $\omega = \omega_\psi$ for some unit vector $\psi \in H$, where $\omega_\psi(a) = \langle\psi, a\psi\rangle$, cf. (2.42). 

The proof of (4.38) is practically the same as in the finite-dimensional case. From 
Theorem B.146 we obtain another characterization of $S_n(B(H))$ and hence of $\mathcal{D}(H)$: 

Corollary 4.15. If $B_0(H)$ is the C*-algebra of compact operators on $H$, we have 

$S(B_0(H)) \cong S_n(B(H))$; (4.39) 

$P(B_0(H)) \cong P_n(B(H))$, (4.40) 

in the sense that any (pure) state $\omega$ on $B_0(H)$ has a unique normal extension to a 
(pure) state $\omega'$ on $B(H)$, given by the same density operator $\rho$ that yields $\omega$. 

It can be shown that any state $\omega \in S(B(H))$ has a convex decomposition 

$\omega = t\omega_n + (1 - t)\omega_s$, (4.41) 

where $t \in [0,1]$, $\omega_n$ is a normal state, and $\omega_s$ is called a singular state. In particular, 
since for $t \in (0,1)$ the state $\omega$ is mixed, a pure state is either normal or singular. 

Singular states are not as aberrant as the terminology may suggest: such states are 
routinely used in the physics literature and are typically denoted by $|\lambda\rangle$, where $\lambda$ lies 
in the continuous spectrum of some self-adjoint operator (that has to be maximal for 
this notation to even begin to make sense, see §4.3 below). Examples of such “improper 
eigenstates” are $|x\rangle$ and $|p\rangle$, which many physicists regard as idealizations. 
However, mathematically such states are at least defined, namely as singular pure 
states on $B(H)$. The key to the existence of such states lies in Proposition C.15 and 
its proof, which should be reviewed now; we only need the case $a^* = a$. 

Proposition 4.16. Let $a = a^* \in B(H)$ have non-empty continuous spectrum, so that 
there is some $\lambda \in \sigma(a)$ that is not an eigenvalue of $a$. Then $\omega_\lambda(f(a)) = f(\lambda)$ defines 
a pure state on $A = C^*(a)$, whose extension to $B(H)$ by any pure state is singular. 

Proof. Normal pure states on $B(H)$ take the form $\omega_\psi(b) = \langle\psi, b\psi\rangle$, where $\psi \in H$ is 
a unit vector and $b \in B(H)$. We know from Proposition C.14 that $\omega_\lambda$ is multiplicative 
on $C^*(a)$. However, if some multiplicative state $\omega$ on $C^*(a)$ has the form $\omega = \omega_\psi$, 
then $\psi$ must be an eigenvector of $a$; cf. the proof of Proposition 2.3. □ 
4.3 The Kadison-Singer Conjecture 

To obtain deeper insight into singular pure states, and as a matter of independent 
interest, we return to the Kadison-Singer problem, cf. §2.6. Recall that this problem 
asks if some abelian unital C*-algebra $A \subset B(H)$ has the Kadison-Singer property, 
stating that a pure state $\omega_A$ on $A$ has a unique pure extension $\omega$ to $B(H)$. Here the 
issue is uniqueness rather than existence, since at least one such extension exists: since 
$A$ is necessarily unital (with $1_A = 1_H$) and $\omega_A$ is a state on $A$, so that in particular 
$\omega_A(1_A) = \|\omega_A\| = 1$, Corollary B.41 gives the existence of a bounded extension $\omega$ 
satisfying $\omega(1_H) = \|\omega\| = 1$, which by Proposition C.5 is a state on $B(H)$. Proposition 
2.22 then gives the existence of a pure extension $\omega$. As in the finite-dimensional 
case, the Kadison-Singer property forces $A$ to be maximal (in the poset $\mathcal{C}(B(H))$ of 
all abelian unital C*-subalgebras of $B(H)$, ordered by inclusion): 

Proposition 4.17. If some abelian unital C*-subalgebra $A$ of $B(H)$ has the Kadison-Singer 
property, then $A$ is necessarily maximal. 

Proof. We use the Gelfand isomorphism $A \cong C(P(A))$, where $P(A)$ is the pure state 
space of $A$, cf. Theorem C.8 and Proposition C.14. If $A$ has the Kadison-Singer 
property and $A \subseteq B \subset B(H)$, where $B$ is an abelian unital C*-subalgebra of $B(H)$, 
then $\omega_A$ has a unique pure extension $\omega$ on $B(H)$, which restricts to some state $\omega_B$ on 
$B$. The same reasoning as in the proof of Proposition 2.22 shows that $\omega_B$ is a pure 
state on $B$, so that we obtain a unique map 

$P(A) \to P(B)$; (4.42) 

$\omega_A \mapsto \omega_B$. (4.43) 

The inverse of this map is simply the pullback of the inclusion $A \hookrightarrow B$, i.e., $\omega_B \in 
P(B)$ defines $\omega_A \in P(A)$ by restriction, so that we have a bijection $P(A) \cong P(B)$, 
$\omega_A \leftrightarrow \omega_B$. Since for any pair of C*-algebras $A \subset B$ the pullback $S(B) \to S(A)$ is 
continuous (in the pertinent w*-topology), the map $\omega_B \mapsto \omega_A$ is continuous. As in 
Lemma C.20, this implies that it is in fact a homeomorphism, so that $A \cong B$ through 
the inclusion $A \hookrightarrow B$. This gives $A = B$, and hence $A$ is maximal. □ 

Maximality of $A$ implies $A' = A$, so that $A$ is a von Neumann algebra, sharing the 
unit of $B(H)$. To see the relevance of singular states for the Kadison-Singer problem, 
we first settle the normal case. We know what it means for a state on $B(H)$ 
to be normal (cf. Definition 4.11 and Corollary 4.13); for arbitrary von Neumann 
algebras $A \subset B(H)$ the situation is exactly the same: we define normality by (4.28) 
and characterize it by the equivalent properties in Corollary 4.13, where the $\sigma$-weak 
topology on $A$ may be defined either as the one inherited from $B(H)$, or, more 
intrinsically, as the w*-topology from the duality $A = (A_*)^*$, where the Banach space 
$A_*$ is the so-called predual of $A$, e.g., $\ell^\infty_* = \ell^1$ and $L^\infty(0,1)_* = L^1(0,1)$, cf. §B.9. 

Theorem 4.18. Let $H$ be a separable Hilbert space and let $\omega_A$ be a normal pure 
state on a maximal commutative unital C*-algebra $A$ in $B(H)$. Then $\omega_A$ has a unique 
extension to a state $\omega$ on $B(H)$, which is necessarily pure and normal. 

Proof. As noted after (4.41), a pure state on $B(H)$ is either normal or singular. The 
possibility that $\omega_A$ is normal whereas $\omega$ is singular is excluded by Corollary 4.13.3, 
so $\omega$ must be normal and hence given by a density operator. The proof of uniqueness 
is then the same as in the finite-dimensional case, cf. Theorem 2.21. □ 

We now recall the classification of maximal abelian *-algebras (and 
hence of maximal abelian von Neumann algebras) $A$ in $B(H)$ up to unitary equivalence 
(cf. Theorem B.118). This classification is the relevant one for the Kadison-Singer 
problem, since, as is easily seen, $A \subset B(H)$ has the Kadison-Singer property 
iff $uAu^{-1} \subset B(uH)$ has it. The uniqueness of the finite-dimensional case will be lost: 

Theorem 4.19. If $H$ is separable and infinite-dimensional, and $A \subset B(H)$ is a maximal 
abelian *-algebra, then $A$ is unitarily equivalent to exactly one of the following: 

1. $L^\infty(0,1) \subset B(L^2(0,1))$; 

2. $\ell^\infty \subset B(\ell^2)$; 

3. $L^\infty(0,1) \oplus \ell^\infty(\kappa) \subset B(L^2(0,1) \oplus \ell^2(\kappa))$, 

where $\ell^\infty = \ell^\infty(\mathbb{N})$, $\ell^2 = \ell^2(\mathbb{N})$, and $\kappa$ is either $\{1, \ldots, n\}$, in which case $\ell^2(\kappa) = \mathbb{C}^n$ 
and $\ell^\infty(\kappa) = D_n(\mathbb{C})$, or $\kappa = \mathbb{N}$, in which case $\ell^2(\kappa) = \ell^2$ and $\ell^\infty(\kappa) = \ell^\infty$. 

This classification sheds some more light on Theorem 4.18. Since $L^\infty(0,1)$ has no 
pure normal states and $D_n(\mathbb{C})$ has been dealt with in Theorem 2.21, the interesting 
case is $\ell^\infty$. Using Corollary 4.13.3 (or the analysis below), it is easy to check that 
the normal pure states on $\ell^\infty$ are given by $\omega_x(f) = f(x)$ for some $x \in \mathbb{N}$; these are 
vector states of the kind $\omega_x(f) = \langle\psi, m_f\psi\rangle$ with $\psi = \delta_x$, or, in other words, they are 
given by $\omega_x(f) = \mathrm{Tr}(\rho m_f)$ with $\rho = |\delta_x\rangle\langle\delta_x|$. We now invoke a fairly deep result: 

Proposition 4.20. A pure state $\omega$ on $B(H)$ is singular iff one (and hence all) of the 
following equivalent conditions is satisfied: 

• $\omega(a) = 0$ for each $a \in B_0(H)$; 

• $\omega(e) = 0$ for each one-dimensional projection $e$; 

• $\sum_i \omega(e_i) = 0$ for the projections $e_i = |\upsilon_i\rangle\langle\upsilon_i|$ defined by some basis $(\upsilon_i)$. 

One direction is easy: a normal pure state certainly does not satisfy the condition 
in question. For example, given (2.42) one may take $a = |\psi\rangle\langle\psi|$, which as a 
one-dimensional projection lies in $B_0(H)$, so that $\omega_\psi(a) = 1$. We omit the other direction 
of the proof. We conclude from this proposition that a pure singular state on $B(\ell^2)$ 
cannot restrict to a normal pure state on $\ell^\infty$, which reconfirms Theorem 4.18. 

We now study the Kadison-Singer property for each of the three cases in Theorem 
4.19 (where the third will be an easy corollary of the first and the second). Since 
the proofs of the first two cases are formidable, we just sketch the argument. 

Theorem 4.21. • There exist (necessarily singular) pure states on $L^\infty(0,1)$ that do 
not have a unique extension to $B(L^2(0,1))$, and similarly for $L^\infty(0,1) \oplus \ell^\infty(\kappa)$. 

• Any pure state on $\ell^\infty$ has a unique extension to $B(\ell^2)$. 

The statement about $\ell^\infty$ is the Kadison-Singer Conjecture, which dates from 1959 
but was only proved in 2013. The first claim (which was already known to Kadison 
and Singer themselves) is equally remarkable, however, as is the contrast between 
the two parts of Theorem 4.21. In particular, Dirac’s notation $|\lambda\rangle$ may be ambiguous. 

The key to the proof of the first claim lies in the choice of a total countable 
family of normal states on $L^\infty(0,1)$, from which all pure states may be constructed 
by a limiting operation. Here we call a (countable) family $(\omega_n)$ of states on some 
C*-algebra $A$ total if, for any self-adjoint $a \in A$, the conditions $\omega_n(a) \geq 0$ for each $n$ 
imply $a \geq 0$ (the converse is trivial). For example, the well-known Haar basis $(h_n)$ 
of $L^2(0,1)$ provides such a family. The functions forming this basis are defined via 
some bijection $\beta$ between the set of pairs $(k,l)$ and $\mathbb{N}$, e.g., $\beta(k,l) = l + 2^k$, by 

$h_n = \chi_{\beta^{-1}(n)}, \quad (n \in \mathbb{N} = \{1, 2, \ldots\})$; (4.44) 

$\chi_{kl}(x) = 2^{k/2}\chi(2^k x - l), \quad (k \in \mathbb{N} \cup \{0\},\ 0 \leq l < 2^k)$; (4.45) 

$\chi = 1_{[0,1/2)} - 1_{[1/2,1]}$. (4.46) 

Basic analysis then shows that the Haar functions $h_n$ form a basis of $L^2(0,1)$ and 
that the associated vector states $\omega_n$ on $L^\infty(0,1)$ form a total set, where obviously 

$\omega_n(f) = \langle h_n, m_f h_n\rangle = \int_0^1 h_n^2 f$. (4.47) 
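The functions (4.45) - (4.46) and the states (4.47) are simple enough to tabulate. The Python sketch below is an illustration under assumptions: NumPy is used, the function names are hypothetical, and integrals over $(0,1)$ are evaluated by the midpoint rule on a dyadic grid, which is exact for the piecewise constant functions $\chi_{kl}$ with $k \leq 4$ considered here. It checks orthonormality of these $\chi_{kl}$ and evaluates one state on the sample function $f(x) = x$.

```python
import numpy as np

def chi(x):
    """Mother function chi = 1_[0,1/2) - 1_[1/2,1], cf. (4.46); zero outside [0, 1]."""
    x = np.asarray(x, dtype=float)
    left = np.where((0.0 <= x) & (x < 0.5), 1.0, 0.0)
    right = np.where((0.5 <= x) & (x <= 1.0), 1.0, 0.0)
    return left - right

def chi_kl(k, l, x):
    """Rescaled and translated copy of chi, cf. (4.45)."""
    return 2.0 ** (k / 2.0) * chi(2.0 ** k * x - l)

cells = 2 ** 7
x = (np.arange(cells) + 0.5) / cells   # midpoints of a dyadic grid on (0, 1)

def integral(values):
    """Midpoint rule on (0, 1) for values sampled at the grid midpoints."""
    return float(np.mean(values))

pairs = [(k, l) for k in range(5) for l in range(2 ** k)]
gram = np.array([[integral(chi_kl(k, l, x) * chi_kl(kk, ll, x))
                  for (kk, ll) in pairs] for (k, l) in pairs])

def haar_state(k, l, f):
    """omega(f) = integral of chi_kl^2 * f, cf. (4.47)."""
    return integral(chi_kl(k, l, x) ** 2 * f(x))

value = haar_state(2, 3, lambda t: t)   # average of f(x) = x over [3/4, 1)
```

Since $\chi_{kl}^2 = 2^k$ on the dyadic interval $[l/2^k, (l+1)/2^k)$ and zero elsewhere, each such state is just the average of $f$ over that interval, which is how these states probe $f$ at ever finer scales.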

The relevance of total sets to the conjecture is explained by the following lemma. 

Lemma 4.22. If $T \subset S(A)$ is a total set of states on a unital C*-algebra $A$, then 

$S(A) = \mathrm{co}(T)^-$; (4.48) 

$P(A) \subset T^-$, (4.49) 

where $\mathrm{co}(T)^-$ is the w*-closure of the convex hull of $T$ in $A^*$ or in $S(A)$. 

Proof. The inclusion $\mathrm{co}(T)^- \subseteq S(A)$ is obvious, since $T \subset S(A)$ and $S(A)$ is a compact 
(and hence a closed) convex set. To prove the converse inclusion, suppose 
$a = a^* \in A$ and $s \in \mathbb{R}$ are such that $\omega(a) \geq s$ for each $\omega \in T$. Then $\omega(a - s \cdot 1_A) \geq 0$ 
for each $\omega \in T$, so that $a - s \cdot 1_A \geq 0$ because $T$ is total, 
and hence $\omega(a) \geq s$ for each $\omega \in S(A)$. Using Theorem B.43 (of Hahn-Banach 
type), this property would lead to a contradiction if $S(A)$ were not contained in 
$\mathrm{co}(T)^-$. 

The second claim, which is the one we will use, follows from the first through a 
corollary of the Krein-Milman Theorem B.50, stating that if $T \subset K$ is any subset of 
a compact convex set $K$ such that $K = \mathrm{co}(T)^-$, then $\partial_e K \subset T^-$. This corollary may 
be proved (by contradiction) from Theorem B.43 in a similar way. □ 

Our next aim is to get rid of the closure in (4.49). The Haar basis yields a map 

$h : \mathbb{N} \to S(L^\infty(0,1))$; (4.50) 

$n \mapsto \omega_n$, (4.51) 

with image $T$, i.e., the set of Haar states. Since $S(A)$ is a compact Hausdorff space (in 
its w*-topology), the universal property (B.135) of the Čech-Stone compactification 
$\beta\mathbb{N}$ of $\mathbb{N}$ implies that $h$ extends (uniquely) to a continuous map 

$\beta h : \beta\mathbb{N} \to S(A)$, 

whose image is compact and hence closed (since $\beta\mathbb{N}$ is compact). Since $T = h(\mathbb{N}) \subset 
S(A)$ we have $T \subset \beta h(\beta\mathbb{N})$ and hence $T^- \subset \beta h(\beta\mathbb{N})$, so that, from (4.49), 

$P(L^\infty(0,1)) \subset \beta h(\beta\mathbb{N})$. (4.52) 

Hence each pure state $\omega_c \equiv \omega_{L^\infty(0,1)}$ on $L^\infty(0,1)$ takes the form $\omega_c = \omega_c^{(U)}$, where 

$\omega_c^{(U)}(f) = \lim_U \omega_n(f) = \bigcap_{A\in U} \{\omega_n(f) \mid n \in A\}^-$, (4.53) 

and $U \in \beta\mathbb{N}$ is some ultrafilter on $\mathbb{N}$, cf. (B.136). The point of this analysis, then, is 
that $\omega_c^{(U)}$ can immediately be extended to $B(L^2(0,1))$ by the same formula, i.e., 

$\omega^{(U)}(a) = \lim_U \omega_n(a) = \bigcap_{A\in U} \{\omega_n(a) \mid n \in A\}^-, \quad a \in B(L^2(0,1))$, (4.54) 

where $\omega_n(a) = \langle h_n, a h_n\rangle$. If $L^\infty(0,1)$ had the Kadison-Singer property, this would be the 
unique extension of $\omega_c^{(U)}$, and we will show that this leads to a contradiction. 

Apart from the use of ultrafilters, the technically most challenging part of the 
argument disproving the Kadison-Singer property for $L^\infty(0,1)$ is as follows. If $A = 
C([0,1])$, for any $f \in A$ and any pure state $\omega \in P(A)$ there is some $x \in [0,1]$ such 
that $\omega(f) = f(x)$; see Propositions C.14 and C.19. For $A = L^\infty(0,1)$ the situation is 
not that simple due to measure zero complications. Nonetheless, it is easy to show 
that for each positive $f \in L^\infty(0,1)$ and $\omega_c \in P(L^\infty(0,1))$ and each $\varepsilon > 0$ one has 

$\mu(\{x \in (0,1) \mid f(x) \in [\omega_c(f) - \varepsilon, \omega_c(f) + \varepsilon]\}) > 0$, (4.55) 

where $\mu$ is Lebesgue measure on $(0,1)$. Taking the projection 

$e = 1_{\{x \in (0,1) \mid f(x) \in [\omega_c(f) - \varepsilon/2,\ \omega_c(f) + \varepsilon/2]\}}$, 

it follows that for each positive $f \in L^\infty(0,1)$, $\omega_c \in P(L^\infty(0,1))$ and $\varepsilon > 0$ there exists 
a projection $e \in \mathcal{P}(L^\infty(0,1))$ with $\omega_c(e) = 1$ and $\|ef - e\omega_c(f)\| < \varepsilon$. Hard analysis 
then generalizes this property from $L^\infty(0,1)$ to $B(L^2(0,1))$, as follows: 

Lemma 4.23. If $\omega_c \in P(L^\infty(0,1))$ has a unique extension $\omega$ to $B(L^2(0,1))$ (which is 
necessarily pure if it is unique), then for each $a \in B(L^2(0,1))$ and $\varepsilon > 0$ there exists 
a projection $e \in \mathcal{P}(L^\infty(0,1))$ with $\omega_c(e) = 1$ and 

$\|eae - e\omega(a)\| < \varepsilon$. (4.56) 
To derive a contradiction between (4.54) and (4.56), we use a bijection $b : \mathbb{N} \to \mathbb{N}$ 
that cyclically permutes the ordered subsets $(2^k + 1, \ldots, 2^{k+1})$, $k = 0, 1, \ldots$, that is, 
(1,2), (3,4), (5,6,7,8), (9,...,16), etc. This bijection induces a unitary operator 

$u : L^2(0,1) \to L^2(0,1)$; (4.57) 

$u h_n = h_{b(n)}$, (4.58) 

which is easily shown to have the following properties: 

$\omega_n(u) = 0, \quad n \in \mathbb{N}$; (4.59) 

$\|eue\| = 1, \quad e \in \mathcal{P}(L^\infty(0,1)),\ e \neq 0$. (4.60) 

To show that $L^\infty(0,1)$ fails to have the Kadison-Singer property, suppose it does, so 
that any $\omega_c \in P(L^\infty(0,1))$ has a unique extension $\omega \in P(B(L^2(0,1)))$. As already 
noted, we may then assume that $\omega_c = \omega_c^{(U)}$, as in (4.53), whilst $\omega = \omega^{(U)}$, as in 
(4.54). Taking $a = u$ then gives $\omega(u) = 0$, see (4.59), so that $\|eue\| < \varepsilon$ by (4.56). But 
this contradicts (4.60), finishing the sketch of the proof of the first claim in Theorem 
4.21. The remark about $L^\infty(0,1) \oplus \ell^\infty(\kappa)$ follows from the one about $L^\infty(0,1)$. 

We now pass to the (even) more difficult case of $\ell^\infty \subset B(\ell^2)$. Although this will 
not be used in the proof, it gives some insight to know which states on $\ell^\infty$ we are 
actually talking about, i.e., the singular pure states, and compare this with (4.53). 

Theorem 4.24. There is a bijective correspondence 

$\omega_d(f) = \int_{\mathbb{N}} d\mu\, f$ (4.61) 

between states $\omega_d$ on $\ell^\infty$ and finitely additive probability measures $\mu$ on $\mathbb{N}$, where: 

1. $\omega_d$ is normal iff $\mu$ is countably additive (and hence is a probability measure). 

2. $\omega_d$ is pure iff $\mu$ corresponds to some ultrafilter $U$ on $\mathbb{N}$, in which case: 

$\omega_d$ is normal iff $U$ is principal (and hence singular iff $U$ is free). 

This follows from case no. 5 in §B.9, notably eqs. (B.153) - (B.154). In other words, 
the pure states $\omega_d$ on $\ell^\infty$ are given by ultrafilters $U$ on $\mathbb{N}$ through 

$\omega_d^{(U)}(f) = \beta f(U) = \lim_U f(n)$; (4.62) 

the analogy with (4.53) is even clearer if we write $f(n) = \langle\delta_n, m_f\delta_n\rangle = \omega_n(f)$. If 
$U = U_n$ is a principal ultrafilter, $n \in \mathbb{N}$, we thus recover the normal pure states 

$\omega_d^{(U_n)}(f) = f(n)$. (4.63) 

As in (4.54), we find at least one natural extension $\omega^{(U)}$ of $\omega_d^{(U)}$ to $B(\ell^2)$, namely 

$\omega^{(U)}(a) = \lim_U \omega_n(a)$. (4.64) 
We now show that $\ell^\infty$ has the Kadison-Singer property, making $\omega^{(U)}$ the 
only extension of $\omega_d^{(U)}$. The proof relies on an extremely difficult lemma from linear 
algebra (formerly known as a paving conjecture). We first define a linear map $D : 
M_n(\mathbb{C}) \to D_n(\mathbb{C})$ by $D(a)_{ii} = a_{ii}$, $i = 1, \ldots, n$, and $D(a)_{ij} = 0$ whenever $i \neq j$. 

Lemma 4.25. For any $\varepsilon > 0$ there exists $l \in \mathbb{N}$ such that for all $n \in \mathbb{N}$ and $a \in M_n(\mathbb{C})$ 
with $D(a) = 0$, there are $l$ projections $(e_1, \ldots, e_l)$ in $D_n(\mathbb{C})$ such that 

$\sum_{k=1}^{l} e_k = 1_n$; (4.65) 

$\|e_i a e_i\| \leq \varepsilon\|a\|, \quad i = 1, \ldots, l$. (4.66) 
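To make the objects in Lemma 4.25 concrete, the Python sketch below computes the diagonal part $D(a)$ and the norms $\|e_i a e_i\|$ for a given choice of diagonal projections. It is only a toy illustration under assumptions: NumPy is used, the helper names are hypothetical, and the sample matrix is the adjacency matrix of a 4-cycle, for which a paving with two projections happens to work perfectly. It says nothing about the general (difficult) statement.

```python
import numpy as np

def diagonal_part(a):
    """D(a): keep the diagonal entries of a and set all off-diagonal entries to zero."""
    return np.diag(np.diag(a))

def paving_norms(a, blocks):
    """Operator norms ||e_i a e_i|| for the diagonal projections e_i onto the index sets in blocks."""
    norms = []
    for block in blocks:
        e = np.zeros(a.shape[0])
        e[list(block)] = 1.0
        compressed = np.diag(e) @ a @ np.diag(e)
        norms.append(float(np.linalg.norm(compressed, 2)))
    return norms

# Hypothetical sample: adjacency matrix of the 4-cycle 0-1-2-3-0, so D(a) = 0.
a = np.array([[0.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0, 0.0]])

blocks = [(0, 2), (1, 3)]                          # e_1 + e_2 = 1_4
norms = paving_norms(a, blocks)                    # ||e_i a e_i|| for i = 1, 2
trivial = paving_norms(a, [(0, 1, 2, 3)])[0]       # no paving at all: ||a||
```

Here the two index sets are the colour classes of the bipartite graph, so each compression $e_i a e_i$ vanishes, whereas the trivial choice $e = 1_4$ returns $\|a\| = 2$. The content of the lemma is that a fixed number $l$ of projections suffices for all $n$ and all $a$ with $D(a) = 0$.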

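Since this estimate is uniform in $n$, the lemma extends to $\ell^\infty \subset B(\ell^2)$, where $D : B(\ell^2) \to \ell^\infty$ 
is defined analogously, i.e., $D(a)$ is diagonal in the canonical basis $(\delta_n)$ of $\ell^2$ with 

$D(a)\delta_n = \omega_n(a)\delta_n, \quad n \in \mathbb{N}$. (4.67) 

Lemma 4.26. For any $\varepsilon > 0$ there exists $l \in \mathbb{N}$ such that for all $a \in B(\ell^2)$ with $D(a) = 0$, 
there are $l$ projections $(e_1, \ldots, e_l)$ in $\ell^\infty$ such that 

$\sum_{k=1}^{l} e_k = 1_H$; (4.68) 

$\|e_i a e_i\| \leq \varepsilon\|a\|, \quad i = 1, \ldots, l$. (4.69) 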

Now suppose that $\omega_d \in P(\ell^\infty)$, that $\omega \in S(B(\ell^2))$ extends $\omega_d$, and that $a \in B(\ell^2)$ has $D(a) = 0$. Let $e_i$ be one of the projections in Lemma 4.26. Using Cauchy-Schwarz for the sesquilinear form $(a, b) = \omega(a^* b)$, we obtain (using $e_i^2 = e_i^* = e_i$)

$|\omega(e_i a e_j)|^2 \leq \omega(e_i)\,\omega(e_j a^* a e_j)$; (4.70)

$|\omega(e_i a e_j)|^2 \leq \omega(a^* e_i a)\,\omega(e_j)$. (4.71)

Since $\omega(e_i) = \omega_d(e_i)$ and $\omega_d$ is a pure state (and hence is multiplicative), we have $\omega(e_i) \in \{0, 1\}$, since $e_i$ is a projection. Moreover, in view of (4.68) and the normalization $\omega(1_H) = 1$, there must be exactly one value of $i = 1, \ldots, l$, say $i = i_0$, such that $\omega(e_{i_0}) = 1$, and $\omega(e_i) = 0$ for all $i \neq i_0$. Eqs. (4.70) - (4.71) therefore imply that $\omega(e_i a e_j)$ can be nonzero only if $i = j = i_0$. Using (4.68) once more, we see that $\omega(a) = \sum_{i,j} \omega(e_i a e_j) = \omega(e_{i_0} a e_{i_0})$, so that $|\omega(a)| \leq \|\omega\| \|e_{i_0} a e_{i_0}\| \leq 1 \cdot \varepsilon \|a\|$ by (4.66). Letting $\varepsilon \to 0$, we proved:

Lemma 4.27. If $\omega \in S(B(\ell^2))$ extends $\omega_d \in P(\ell^\infty)$, and $D(a) = 0$, then $\omega(a) = 0$.

Since $D^2 = D$, we have $D(a - D(a)) = 0$, so that for any $a \in B(\ell^2)$, we have

$\omega(a) = \omega(D(a)) = \omega_d(D(a))$, (4.72)

provided that $\omega$ extends $\omega_d$, as before. This shows that $\omega$ is determined by $\omega_d$ and hence is unique, completing the proof (sketch) of Theorem 4.21.

4.4 Gleason's Theorem in arbitrary dimension

To a large extent the thrust and difficulty of the proof of Gleason's Theorem 2.28 already lies in its finite-dimensional version, but some care is needed in the general case, and also Corollary 2.29 needs to be refined. A major point here is that Definition 2.23 has no unambiguous generalization to arbitrary Hilbert spaces.

Definition 4.28. Let $H$ be an arbitrary Hilbert space with unit sphere $H_1$.

1. A probability distribution on $\mathcal{P}_1(H)$ is a map $p : H_1 \to [0, 1]$ that satisfies

$\sum_{i \in I} p(v_i) = 1$, for any basis $(v_i)$ of $H$, (4.73)

where, as in §B.12, the sum (over a possibly uncountable index set) is meant as in Definition B.6. In particular, if $H$ is separable and the basis is labeled and ordered by $I = \mathbb{N}$, then it is an ordinary convergent sum of the kind $\sum_{i=1}^{\infty} \cdots$.

2. A map $P : \mathcal{P}(H) \to [0, 1]$ that satisfies $P(1_H) = 1$ is called a:

a. finitely additive probability measure if

$P\left(\sum_{j \in J} e_j\right) = \sum_{j \in J} P(e_j)$ (4.74)

for any finite collection $(e_j)_{j \in J}$ of mutually orthogonal projections on $H$ (i.e., $e_j H \perp e_k H$, or equivalently, $e_j e_k = 0$, whenever $j \neq k$); this is equivalent to the condition $P(e + f) = P(e) + P(f)$ whenever $ef = 0$, cf. Definition 2.23.2.

b. probability measure if (4.74) holds for any countable collection $(e_j)_{j \in J}$ of mutually orthogonal projections on $H$, where the first sum is defined in the strong operator topology; note that the strong sum $\sum_j e_j$ coincides with the supremum $\bigvee_j e_j$ of the given family, defined with respect to the usual ordering of projections (that is, $e \leq f$ iff $eH \subseteq fH$).

c. completely additive probability measure if (4.74) holds for arbitrary collections $(e_j)_{j \in J}$ of mutually orthogonal projections on $H$ (the first sum again meant in the strong operator topology, with the same comment as above).

Thus a probability measure is by definition $\sigma$-additive in the usual sense of measure theory; the other two cases are unusual from that perspective. However, if $H$ is separable, then $J$ can be at most countable, so that complete additivity is the same as $\sigma$-additivity and hence any probability measure is completely additive. Surprisingly, assuming the Continuum Hypothesis (CH) of set theory, it can be shown that this is even the case for arbitrary Hilbert spaces. The fundamental distinction, then, is between finitely additive probability measures and probability measures (which by definition are countably additive). As we shall see, this reflects the distinction between arbitrary and normal states on $B(H)$, respectively, cf. §4.2. In what follows, in dealing with non-separable Hilbert spaces we assume CH, in which case probability distributions on $H$ are equivalent to probability measures on $\mathcal{P}(H)$. The proof is the same as in finite dimension (taking into account that infinite sums over projections are defined strongly). Even without CH, Gleason's Theorem still holds for non-separable Hilbert spaces if we assume $P$ to be completely additive, and probability distributions are equivalent to completely additive probability measures on $\mathcal{P}(H)$. For separable Hilbert spaces, CH is irrelevant and unnecessary altogether. We then have the following generalization (and bifurcation) of Theorem 2.28.

Theorem 4.29. Let $H$ be a Hilbert space of dimension $> 2$.

1. Each probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique normal state on $B(H)$ via (2.122), i.e.,

$P(e) = \mathrm{Tr}(\rho e)$, (4.75)

where $\rho$ is a density operator on $H$ uniquely determined by $P$.

Equivalently, each probability distribution $p$ on $\mathcal{P}_1(H)$ is given by (2.123), or

$p(v) = \langle v, \rho v \rangle$. (4.76)

Conversely, each density operator $\rho$ on $H$ defines a probability measure $P$ on $\mathcal{P}(H)$ via (4.75), as well as a probability distribution $p$ on $\mathcal{P}_1(H)$ via (4.76).

2. Each finitely additive probability measure $P$ on $\mathcal{P}(H)$ is induced by a unique state $\omega$ on $B(H)$ via

$P(e) = \omega(e)$, (4.77)

and similarly each probability distribution $p$ on $\mathcal{P}_1(H)$ is given by

$p(v) = \omega(e_v)$. (4.78)

Conversely, each state $\omega$ on $B(H)$ defines a finitely additive probability measure $P$ on $\mathcal{P}(H)$ via (4.77), as well as a probability distribution $p$ on $\mathcal{P}_1(H)$ via (4.78).

Proof. The proof of part 1 is practically the same as in finite dimension, except for the fact that in the proof of Lemma 2.33 the reference to Proposition A.23 should be replaced by Proposition B.79, upon which one obtains a bounded positive operator $\rho$ for which (2.123) holds. The normalization condition (2.110) then yields $\mathrm{Tr}(\rho) = 1$ if the trace is taken over any basis of $H$, and since $\rho$ is positive this implies $\rho \in B_1(H)$, see §B.20 (complete additivity of $P$ is just necessary to relate it to $p$).

Unfortunately, the proof of part 2 exceeds the scope of this book (see Notes). □

In infinite dimension, Corollary 2.29 becomes more complicated, too; for one thing, Definition 2.26 of a quasi-state bifurcates into two possibilities. The one given still makes perfect sense and is natural from the point of view of Bohrification; to avoid confusion we call a map $\omega : B(H) \to \mathbb{C}$ satisfying the conditions in Definition 2.26 a strong quasi-state. In the context of Gleason's Theorem, a slightly different notion is appropriate: a weak quasi-state on $B(H)$ satisfies Definition 2.26, except that linearity is only required on commutative C*-algebras in $B(H)$ of the form $C^*(a)$, where $a = a^* \in B(H)$ (these are singly generated). Since commutative unital C*-subalgebras of $B(H)$ are not necessarily singly generated, and a specific counterexample exists, weak quasi-states are not necessarily strong quasi-states.

Proposition 4.30. The map $\omega \mapsto \omega|_{\mathcal{P}(H)}$ gives a bijective correspondence between weak quasi-states $\omega$ on $B(H)$ and finitely additive probability measures on $\mathcal{P}(H)$.

Proof. For some finite family $(e_1, \ldots, e_n)$ of mutually orthogonal projections on $H$, add $e_0 = 1_H - \sum_{j=1}^{n} e_j$ if necessary and let $a = \sum_j \lambda_j e_j$ with all $\lambda_j \in \mathbb{R}$ different. Then $\sigma(a) = \{\lambda_0, \ldots, \lambda_n\}$, so that $C^*(a) \cong C(\sigma(a)) \cong \mathbb{C}^{n+1}$ (cf. Theorem B.94) coincides with the linear span of the projections $e_j$. If $\omega$ is a weak quasi-state, then it is linear on $C^*(a)$ and hence also on the $e_j$, so that $\omega|_{\mathcal{P}(H)}$ is finitely additive.

Conversely, let $\mu$ be a finitely additive probability measure on $\mathcal{P}(H)$. If $a = a^* \in B(H)$ is given, using the notation (B.328) we symbolically define $\omega$ on $a$ by

$\omega(a) = \int_{\sigma(a)} d\mu(e_\lambda)\, \lambda$. (4.79)

More precisely, for any $\varepsilon > 0$ we use Corollary B.104 to define $\omega_\varepsilon(a) = \sum_{i=1}^{n} \lambda_i\, \mu(e_{A_i})$ and let $\omega(a) = \lim_{\varepsilon \to 0} \omega_\varepsilon(a)$; it follows from Lemma B.103 (or the theory underlying the Riemann-Stieltjes integral (4.79)) that this limit exists. Now let $b, c \in C^*(a)$, so that $b = f(a)$ and $c = g(a)$ for certain $f, g \in C(\sigma(a))$, and $b + c = (f + g)(a)$, cf. Theorem B.94. By (B.325) we therefore have $\omega_\varepsilon(b + c) = \sum_{i=1}^{n} (f + g)(\lambda_i)\, \mu(e_{A_i})$, which, since $(f + g)(\lambda_i) = f(\lambda_i) + g(\lambda_i)$, again by (B.325) equals $\omega_\varepsilon(b) + \omega_\varepsilon(c)$. Since this holds for every $\varepsilon > 0$, letting $\varepsilon \to 0$ we obtain $\omega(b + c) = \omega(b) + \omega(c)$, making $\omega$ linear on $C^*(a)$. It is clear that the quasi-state $\omega$ thus obtained, on restriction to $\mathcal{P}(H)$, reproduces $\mu$, making the map $\omega \mapsto \omega|_{\mathcal{P}(H)}$ surjective. Finally, injectivity of this map follows from Corollary B.104. □

Corollary 4.31. If $\dim(H) > 2$, then each weak quasi-state on $B(H)$ (and a fortiori each strong quasi-state) is linear and hence is actually a state.

This is immediate from Theorem 4.29.2 and Proposition 4.30.

Another corollary of Gleason's Theorem is the Kochen-Specker Theorem, which we will explain in detail in Chapter 6, where it will also be proved in a different way.

Theorem 4.32. If $\dim(H) > 2$, there are no weak quasi-states $\omega : B(H) \to \mathbb{C}$ whose restriction to each C*-subalgebra $C^*(a) \subset B(H)$ is pure (where $a = a^* \in B(H)$).

Equivalently, there are no nonzero maps $\omega' : B(H)_{\mathrm{sa}} \to \mathbb{R}$ that are:

• Dispersion-free, i.e., $\omega'(a^2) = \omega'(a)^2$ for each $a \in B(H)_{\mathrm{sa}}$;

• Quasi-linear, i.e., linear on commuting operators.

Cf. Definitions 6.1 and 6.3. To see that these conditions are equivalent to those stated in Theorem 4.32 (despite the impression that linearity on all commuting self-adjoint operators seems stronger than linearity on each $C^*(a)$), extend $\omega'$ to $\omega : B(H) \to \mathbb{C}$ by complex linearity, as in Definition 2.26.1, and note that dispersion-freeness implies positivity and hence continuity on each subalgebra $C^*(a)$ (cf. Theorem C.52 and Lemma C.4). We then see that the two conditions just stated imply that $\omega$ is multiplicative on $C^*(a)$, and hence pure, see Proposition C.14, which conversely implies that pure states on $C^*(a)$ are dispersion-free. We now prove Theorem 4.32.

Proof. If $e$ is a projection, then $e^2 = e$, so that $\omega(e^2) = \omega(e)$. Since $\omega$ is dispersion-free (as just explained), we also have $\omega(e^2) = \omega(e)^2$, whence $\omega(e)^2 = \omega(e)$ and hence $\omega(e) \in \{0, 1\}$. Furthermore, since $\omega$ is a state by Corollary 4.31, we may apply the GNS-construction, see Theorem C.88 (whose notation we use). In particular, for any projection $e$, using the fact that $\pi_\omega(e) = \pi_\omega(e)^* \pi_\omega(e)$, by (C.196) we have

$\omega(e) = \langle \Omega_\omega, \pi_\omega(e) \Omega_\omega \rangle = \|\pi_\omega(e) \Omega_\omega\|^2$. (4.80)

If $\omega(e) = 0$, then $\pi_\omega(e)\Omega_\omega = 0$ from the second equality. If $\omega(e) = 1$, then $\pi_\omega(e)\Omega_\omega = \Omega_\omega$ from the first equality and Cauchy-Schwarz (in which we have equality, so that $\pi_\omega(e)\Omega_\omega = z\Omega_\omega$ for some $z \in \mathbb{T}$, upon which (4.80) forces $z = 1$).

By the spectral theorem (e.g. in the form Corollary B.104) or the theory of von Neumann algebras, the linear span of $\mathcal{P}(H)$ is norm-dense in $B(H)$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(B(H))$ by the GNS-construction, it must be that $H_\omega = \mathbb{C} \cdot \Omega_\omega$, and hence $\pi_\omega(a) = \omega(a) \cdot 1_{H_\omega}$ for any $a \in B(H)$. Since $\pi_\omega(ab) = \pi_\omega(a)\pi_\omega(b)$ by the GNS-construction, this gives $\omega(ab) = \omega(a)\omega(b)$ for all $a, b \in B(H)$. However, such multiplicative states $\omega$ on $B(H)$ cannot exist if $\dim(H) > 1$. This is clear if $\omega$ is normal, cf. Proposition 2.10, so that the following argument (which also covers the normal case) is especially meant for the case where $\omega$ is singular.

1. If $\dim(H) = n < \infty$, there are $n$ one-dimensional projections $(e_1, \ldots, e_n)$ such that $\sum_i e_i = 1_H$ (indeed, we may assume that $B(H) = M_n(\mathbb{C})$ and take diagonal matrices $e_1 = \mathrm{diag}(1, 0, \ldots, 0)$, etc.). Now for any pair $(e_i, e_j)$ there is some $v \in B(H)$ (which by definition is a partial isometry) such that $e_i = vv^*$, $e_j = v^*v$ (in the above case $e_i$ and $e_j$ are thus related if $v_{ij} = 1$ and $v_{i'j'} = 0$ otherwise). Hence

$\omega(e_i) = \omega(vv^*) = \omega(v)\omega(v^*) = \omega(v^*v) = \omega(e_j)$, (4.81)

since $\omega$ is multiplicative. But $\omega$ is also additive, which implies

$\sum_i \omega(e_i) = \omega\left(\sum_i e_i\right) = \omega(1_H) = 1$. (4.82)

Since also $\omega(e_i) \in \{0, 1\}$, eqs. (4.81) - (4.82) are clearly contradictory.

2. If $\dim(H) = \infty$, separable or not, a similar contradiction arises from the halving lemma, which states that there is a projection $e$ and an operator $v$ such that $e = vv^*$, $1_H - e = v^*v$. For example, in the separable case assume $H = \ell^2(\mathbb{N})$ and take $e$ the projection onto the closed linear span of the basis vectors $(\delta_x)$ with $x \in \mathbb{N}$ even, so that $1_H - e$ projects onto the closed linear span of the basis vectors $(\delta_x)$ with $x \in \mathbb{N}$ odd. Then take $v = 0$ on $eH$ and $v : (1_H - e)H \to eH$ any unitary operator. In general, a similar method works, for if $I$ is a set indexing some basis of $H$ one may find a subset $E \subset I$ that has the same cardinality as its complement $I \setminus E$, upon which $\ell^2(E) \cong \ell^2(I \setminus E)$, cf. Theorem B.63. Multiplicativity of $\omega$ then leads to a similar contradiction between the properties $\omega(e) = \omega(1_H - e)$, as in (4.81), and $\omega(e) + \omega(1_H - e) = \omega(1_H) = 1$, as in (4.82): if $\omega(e) = 0$ one finds $0 = 1$, whereas $\omega(e) = 1$ implies $2 = 1$. □

Notes

§4.1. The Born rule from Bohrification (ii)

The Born measure (and its construction along the lines of this section) is well known in functional analysis, cf. Pedersen (1989), §4.5. For the Hamburger Moment Problem see, for example, Reed, M. & Simon, B. (1975), Methods of Modern Mathematical Physics. Vol II. Fourier Analysis, Self-adjointness (New York: Academic Press), Theorem X.4, p. 145 and Example 4, p. 205. In fact, the proof uses spectral theory! Corollary 4.6 was suggested by the treatment of the Born rule in Hall (2013). Definition 4.9 of the joint spectrum goes back (at least) to Arens (1961) and Hörmander (1966), §3.1.13.

§4.2. Density operators and normal states 

These are really results about von Neumann algebras and come from the pertinent 
literature; our proofs derive from Li (1992), §1.8 and Takesaki (2002), Ch. III. 

§4.3. The Kadison-Singer Conjecture 

As already mentioned in the notes to §2.6, the Kadison-Singer Conjecture was first discussed in Kadison & Singer (1959) and was finally proved by Marcus, Spielman, & Srivastava (2014ab), following important intermediate contributions by e.g. Anderson (1979) and Weaver (2004). For an introduction including a complete proof see Stevens (2016), and for applications of the conjecture and its proof to other areas of mathematics see Casazza et al (2005) as well as Casazza & Tremain (2016). Proposition 4.20 is due to Glimm (1960).

§4.4. Gleason’s Theorem in arbitrary dimension 

The extension of Gleason's Theorem to non-separable Hilbert space assuming complete additivity of $P$ is due to Maeda (1980). Maeda (1990) generalizes this result to von Neumann algebras without summands of type $I_2$. The proof that, assuming CH, countable additivity implies complete additivity (and hence Gleason's Theorem) was given by Eilers & Horst (1975). Proposition 4.30 is due to Aarnes (1970), whose Theorem 1 is wrong: see Aarnes (1991). The proof of Theorem 4.32 is due to Döring (2004), using results of Hamhalter (1993).

Symmetry in quantum mechanics


Roughly speaking, a symmetry of some mathematical object is an invertible transformation that leaves all relevant structure as it is. Thus a symmetry of a set is just a bijection (as sets have no further structure, whence invertibility is the only demand on a symmetry), a symmetry of a topological space is a homeomorphism, a symmetry of a Banach space is a linear isometric isomorphism, and, crucially important for this chapter, a symmetry of a Hilbert space $H$ is a unitary operator, i.e., a linear map $u : H \to H$ satisfying one and hence all of the following equivalent conditions:

• $uu^* = u^*u = 1_H$;

• $u$ is invertible with $u^{-1} = u^*$;

• $u$ is a surjective isometry (or, if $\dim(H) < \infty$, just an isometry);

• $u$ is invertible and preserves the inner product, i.e., $\langle u\varphi, u\psi \rangle = \langle \varphi, \psi \rangle$ ($\varphi, \psi \in H$).
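The equivalent conditions above can be checked numerically in the simplest nontrivial case $H = \mathbb{C}^2$. The following Python sketch is an illustration under stated assumptions: the particular matrix `u`, the test vectors, and all helper names are chosen here for the example, and the inner product is taken to be conjugate-linear in its first entry.

```python
import cmath
import math

def dagger(m):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, psi):
    return [sum(m[i][k] * psi[k] for k in range(2)) for i in range(2)]

def inner(phi, psi):
    """Inner product on C^2, conjugate-linear in the first entry (assumed convention)."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

# An example unitary matrix (chosen for illustration).
theta, alpha = 0.7, 0.3
phase = cmath.exp(1j * alpha)
u = [[phase * math.cos(theta), -math.sin(theta)],
     [math.sin(theta), phase.conjugate() * math.cos(theta)]]

phi = [1 + 2j, 0.5 - 1j]
psi = [0.3j, 2.0 + 0j]

# First condition: u u* = u* u = 1.
left_ok = is_identity(matmul(u, dagger(u)))
right_ok = is_identity(matmul(dagger(u), u))

# Last condition: the inner product is preserved.
inner_before = inner(phi, psi)
inner_after = inner(apply(u, phi), apply(u, psi))
```

In finite dimension either of the two products already forces the other, in line with the remark that an isometry is then automatically surjective.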

The discussion of symmetries in quantum physics is based on the above idea, but the mathematically obvious choices need not be the physically relevant ones. Even in elementary quantum mechanics, where $A = B(H)$, i.e., the C*-algebra of all bounded operators on some Hilbert space $H$, the concept of a symmetry is already diverse. The main structures whose symmetries we shall study in this chapter are:

1. The normal pure state space $\mathcal{P}_1(H)$, i.e., the set of one-dimensional projections on $H$, with transition probability $\tau : \mathcal{P}_1(H) \times \mathcal{P}_1(H) \to [0, 1]$ defined by (2.44).

2. The normal state space $\mathcal{D}(H)$, i.e. the convex set of density operators $\rho$ on $H$.

3. The self-adjoint operators $B(H)_{\mathrm{sa}}$ on $H$, seen as a Jordan algebra (see below).

4. The effects $\mathcal{E}(H) = [0, 1]_{B(H)}$, seen as a convex partially ordered set (poset).

5. The projections $\mathcal{P}(H)$ on $H$, seen as an orthocomplemented lattice.

6. The unital commutative C*-subalgebras $\mathcal{C}(B(H))$ of $B(H)$, seen as a poset.

Each of these structures comes with its own notion of a symmetry, but the main point of this chapter will be to show these notions are equivalent, corresponding in all cases to either unitary or, surprisingly, anti-unitary operators, both merely defined up to a phase. The latter subtlety will open the world of projective unitary group representations to quantum mechanics (without which the existence of spin-$\frac{1}{2}$ particles such as electrons, and therewith also of ourselves, would be impossible).

5.1 Six basic mathematical structures of quantum mechanics

We first recall the objects just described in a bit more detail. We have:

$\mathcal{P}_1(H) = \{e \in B(H) \mid e^2 = e^* = e, \ \mathrm{Tr}(e) = \dim(eH) = 1\}$; (5.1)

$\mathcal{D}(H) = \{\rho \in B(H) \mid \rho \geq 0, \ \mathrm{Tr}(\rho) = 1\}$; (5.2)

$B(H)_{\mathrm{sa}} = \{a \in B(H) \mid a^* = a\}$; (5.3)

$\mathcal{E}(H) = \{a \in B(H) \mid 0 \leq a \leq 1_H\}$; (5.4)

$\mathcal{P}(H) = \{e \in B(H) \mid e^2 = e^* = e\}$; (5.5)

$\mathcal{C}(B(H)) = \{C \subset B(H) \mid C \text{ commutative C*-algebra}, \ 1_H \in C\}$. (5.6)

The point is that each of these sets has some additional structure that defines what it means to be a symmetry of it, as we now spell out in detail.

Definition 5.1. Let $H$ be a Hilbert space (not necessarily finite-dimensional).

1. A Wigner symmetry (of $H$) is a bijection

$W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ (5.7)

that satisfies

$\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef)$, $e, f \in \mathcal{P}_1(H)$. (5.8)

2. A Kadison symmetry is an affine bijection

$K : \mathcal{D}(H) \to \mathcal{D}(H)$, (5.9)

i.e. a bijection $K$ that preserves convex sums: for $t \in (0, 1)$ and $\rho_1, \rho_2 \in \mathcal{D}(H)$,

$K(t\rho_1 + (1 - t)\rho_2) = tK\rho_1 + (1 - t)K\rho_2$. (5.10)

3. a. A Jordan symmetry is an invertible Jordan map

$J : B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$, (5.11)

i.e., an $\mathbb{R}$-linear bijection that satisfies the equivalent conditions

$J(a \circ b) = J(a) \circ J(b)$; (5.12)

$J(a^2) = J(a)^2$. (5.13)

Here

$a \circ b = \frac{1}{2}(ab + ba)$ (5.14)

is the Jordan product on $B(H)_{\mathrm{sa}}$, which turns the (real) vector space $B(H)_{\mathrm{sa}}$ into a Jordan algebra, cf. §C.25.

b. A weak Jordan symmetry is an invertible weak Jordan map, i.e., a bijection (5.11) of which the restriction $J|_{C_{\mathrm{sa}}}$ is a Jordan map for each $C \in \mathcal{C}(B(H))$.

4. A Ludwig symmetry is an affine order isomorphism

$L : \mathcal{E}(H) \to \mathcal{E}(H)$. (5.15)

5. A von Neumann symmetry is an order isomorphism

$N : \mathcal{P}(H) \to \mathcal{P}(H)$ (5.16)

preserving orthocomplementation, i.e. $N(1 - e) = 1 - N(e)$ for each $e \in \mathcal{P}(H)$.

6. A Bohr symmetry is an order isomorphism

$B : \mathcal{C}(B(H)) \to \mathcal{C}(B(H))$. (5.17)

In nos. 4-6, an order isomorphism $O$ of the given poset is a bijection that preserves the partial order $\leq$ (i.e., if $x \leq y$, then $O(x) \leq O(y)$) and whose inverse $O^{-1}$ does so, too; cf. §D.1. The names in question have been chosen for historical reasons and (except perhaps for the first and third) are not standard.

Let us note that any Jordan map has a unique extension to a $\mathbb{C}$-linear map

$J_{\mathbb{C}} : B(H) \to B(H)$; (5.18)

$J_{\mathbb{C}}(a^*) = J_{\mathbb{C}}(a)^*$, (5.19)

which satisfies (5.12) for all $a, b$, as well as

$J_{\mathbb{C}}(a + ib) = J(a) + iJ(b)$, (5.20)

with notation as in Proposition 2.6. Conversely, such a Jordan map (5.18) defines a real Jordan map (5.11) by $J = J_{\mathbb{C}}|_{B(H)_{\mathrm{sa}}}$. Similarly, a weak Jordan symmetry is equivalent to a map (5.18) that satisfies (5.19), preserves squares as in (5.13), and is linear on each subspace $C$ of $B(H)$, with $C \in \mathcal{C}(B(H))$. In other words (in the spirit of Bohrification), $J_{\mathbb{C}}$ is a homomorphism of C*-algebras on each commutative unital C*-subalgebra $C \subset B(H)$. Therefore, either way $J$ and $J_{\mathbb{C}}$ are essentially the same thing, and if no confusion may arise we call it $J$. Note that a weak Jordan map $J$ a priori satisfies (5.12) only for commuting self-adjoint $a$ and $b$. It follows that weak (and hence ordinary) Jordan symmetries are unital: since

$J(b) = J(1_H \circ b) = J(1_H) \circ J(b)$ (5.21)

for any $b$, we may pick $b = J^{-1}(1_H)$ to find, reading (5.21) from right to left,

$J(1_H) = J(1_H) \circ 1_H = 1_H$. (5.22)

The special role of unitary operators $u$ now emerges: each such operator defines the relevant symmetry in the obvious way, namely, in order of appearance:

$W(e) = ueu^*$; (5.23)

$K(\rho) = u\rho u^*$; (5.24)

$L(a) = uau^*$; (5.25)

$J(a) = uau^*$; (5.26)

$N(e) = ueu^*$; (5.27)

$B(C) = uCu^*$, (5.28)

where $a^* = a$ in (5.26). If not, this formula remains valid also for the map $J_{\mathbb{C}}$. Furthermore, in (5.28) the notation $uCu^*$ is shorthand for the set $\{uau^* \mid a \in C\}$, which is easily seen to be a member of $\mathcal{C}(B(H))$. Here, as well as in the other five cases, it is easy to verify that the right-hand side belongs to the required set, that is,

$ueu^* \in \mathcal{P}_1(H)$, $u\rho u^* \in \mathcal{D}(H)$, $uau^* \in \mathcal{E}(H)$, (5.29)

$uau^* \in B(H)_{\mathrm{sa}}$, $ueu^* \in \mathcal{P}(H)$, $uCu^* \in \mathcal{C}(B(H))$, (5.30)

respectively, provided, of course, that

$e \in \mathcal{P}_1(H)$, $\rho \in \mathcal{D}(H)$, $a \in \mathcal{E}(H)$, $a \in B(H)_{\mathrm{sa}}$, $e \in \mathcal{P}(H)$, $C \in \mathcal{C}(B(H))$.

Indeed, if, in (5.23), $e = e_\psi = |\psi\rangle\langle\psi|$ for some unit vector $\psi \in H$, then

$u e_\psi u^* = e_{u\psi}$. (5.31)

If $\rho \geq 0$ in that $\langle \psi, \rho\psi \rangle \geq 0$ for each $\psi \in H$, then clearly also $u\rho u^* \geq 0$, and if $\mathrm{Tr}(\rho) = 1$, then also $\mathrm{Tr}(u\rho u^*) = 1$. If $a^* = a$, then

$(uau^*)^* = u^{**}a^*u^* = uau^*$. (5.32)
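As a small numerical illustration of (5.26) and (5.32), the following Python sketch checks on $2 \times 2$ matrices that conjugation by a unitary maps self-adjoint operators to self-adjoint operators, preserves the Jordan product (5.14), and is unital. The matrices `u`, `a`, `b` and all helper names are assumptions made for this example; it is a sanity check in one case, not a proof.

```python
import cmath
import math

def dagger(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2, as in (5.14)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[0.5 * (ab[i][j] + ba[i][j]) for j in range(2)] for i in range(2)]

def conjugate_by(u, a):
    """The map a -> u a u*, as in (5.26)."""
    return matmul(matmul(u, a), dagger(u))

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def is_self_adjoint(a):
    return close(a, dagger(a))

# Example data (chosen for illustration): a unitary u and self-adjoint a, b.
theta, alpha = 0.7, 0.3
phase = cmath.exp(1j * alpha)
u = [[phase * math.cos(theta), -math.sin(theta)],
     [math.sin(theta), phase.conjugate() * math.cos(theta)]]
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0.5, 1j], [-1j, 2]]
identity = [[1, 0], [0, 1]]

lhs = conjugate_by(u, jordan(a, b))                     # J(a o b)
rhs = jordan(conjugate_by(u, a), conjugate_by(u, b))    # J(a) o J(b)
```

The ordinary product $ab$ of two self-adjoint matrices is in general not self-adjoint, whereas $a \circ b$ always is, which is why the Jordan product is the natural structure on $B(H)_{\mathrm{sa}}$.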

However, one may also choose $u$ in these formulae to be anti-unitary, as follows:

Definition 5.2. 1. A real-linear operator $u : H \to H$ is anti-linear if

$u(z\psi) = \overline{z}\, u\psi$ ($z \in \mathbb{C}$). (5.33)

2. An anti-linear operator $u : H \to H$ is anti-unitary if it is invertible, and

$\langle u\varphi, u\psi \rangle = \overline{\langle \varphi, \psi \rangle}$ ($\varphi, \psi \in H$). (5.34)

The adjoint $u^*$ of a (bounded) anti-linear operator $u$ is defined by the property

$\langle u^*\varphi, \psi \rangle = \overline{\langle \varphi, u\psi \rangle}$ ($\varphi, \psi \in H$), (5.35)

in which case $u^*$ is anti-linear, too. Hence we may equally well say that an anti-linear operator is anti-unitary if $uu^* = u^*u = 1_H$. The simplest example is the map

$J : \mathbb{C}^n \to \mathbb{C}^n$; $Jz = \overline{z}$, (5.36)

i.e., if $z = (z_1, \ldots, z_n) \in \mathbb{C}^n$, then $(Jz)_i = \overline{z_i}$. Similarly, one may define

$J\psi = \overline{\psi}$, (5.37)

and likewise on $L^2$, where complex conjugation is defined pointwise, that is,

$(J\psi)(x) = \overline{\psi(x)}$. (5.38)

For any Hilbert space one may pick a basis $(v_i)$ and define $J$ relative to this basis by

$J\left(\sum_i c_i v_i\right) = \sum_i \overline{c_i}\, v_i$. (5.39)

For future use, we state two obvious facts.

Proposition 5.3. 1. The product of two anti-unitary operators is unitary.

2. Any anti-unitary operator $u : H \to H$ takes the form $u = Jv$, where $v$ is unitary and $J$ is an anti-unitary operator on $H$ of the kind constructed above.

It is an easy verification that (5.23) - (5.28) still define symmetries if $u$ is anti-unitary. Note that in terms of the complexification $J_{\mathbb{C}}$, eq. (5.26) should read

$J_{\mathbb{C}}(a) = ua^*u^*$. (5.40)

The goal of the following sections is to show that these are the only possibilities:

Theorem 5.4. Let $H$ be a Hilbert space, with $\dim(H) > 1$.

1. Each Wigner symmetry takes the form (5.23);

2. Each Kadison symmetry takes the form (5.24);

3. Each Ludwig symmetry takes the form (5.25);

4. a. Each Jordan symmetry takes the form (5.26);

b. If $\dim(H) > 2$, also each weak Jordan symmetry takes this form;

5. If $\dim(H) > 2$, each von Neumann symmetry takes the form (5.27);

6. Again if $\dim(H) > 2$, each Bohr symmetry takes the form (5.28),

where in all cases the operator $u$ is either unitary or anti-unitary, and is uniquely determined by the symmetry in question up to a phase (that is, $u$ and $u'$ implement the same symmetry by conjugation iff $u' = zu$, where $z \in \mathbb{T}$).

As we shall see, the reason why the case $H = \mathbb{C}^2$ is exceptional with regard to weak Jordan symmetries, von Neumann symmetries, and Bohr symmetries is that in those cases the proof relies on Gleason's Theorem, which fails for $H = \mathbb{C}^2$.

To see this more explicitly, and also to prove the positive cases (i.e., nos. 1-4a) in a simple situation without invoking higher principles, before proving Theorem 5.4 in general it is instructive to first illustrate it in the two-dimensional case $H = \mathbb{C}^2$.

5.2 The case $H = \mathbb{C}^2$

We start with some background. Any complex $2 \times 2$ matrix $a$ can be written as

$a = a(x_0, x_1, x_2, x_3) = \frac{1}{2} \sum_{\mu=0}^{3} x_\mu \sigma_\mu$ ($x_\mu \in \mathbb{C}$); (5.41)

$\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, (5.42)

i.e., the Pauli matrices. Furthermore, if we equip the vector space $M_2(\mathbb{C})$ of complex $2 \times 2$ matrices with the canonical inner product (2.34), then the rescaled matrices $\sigma_\mu / \sqrt{2}$ form a basis (= orthonormal basis) of the ensuing Hilbert space.

Writing $\mathbf{x} = (x_1, x_2, x_3)$, some interesting special cases are:

• $x_0 \in \mathbb{R}$, $\mathbf{x} = i\mathbf{v}$ with $\mathbf{v} \in \mathbb{R}^3$ and $x_0^2 + v_1^2 + v_2^2 + v_3^2 = 4$, which holds iff $a \in SU(2)$;

• $x_\mu \in \mathbb{R}$ for each $\mu = 0, 1, 2, 3$, which is the case iff $a^* = a$;

• $x_0 = 1$, $\mathbf{x} \in \mathbb{R}^3$, and $\|\mathbf{x}\| = 1$, which holds iff $a$ is a one-dimensional projection.

The first case follows because $SU(2)$ consists of all matrices of the form

$\begin{pmatrix} \alpha & \beta \\ -\overline{\beta} & \overline{\alpha} \end{pmatrix}$, $\alpha, \beta \in \mathbb{C}$, $|\alpha|^2 + |\beta|^2 = 1$. (5.43)

The second case is obvious, and the third follows from Proposition 2.9.

Assume the third case, so that $a = e$ with $e^2 = e^* = e$ and $\mathrm{Tr}(e) = 1$. If a linear map $u : \mathbb{C}^2 \to \mathbb{C}^2$ is unitary, then simple computations show that $e' = ueu^*$ is a one-dimensional projection, too, given by $e' = \frac{1}{2} \sum_{\mu=0}^{3} x'_\mu \sigma_\mu$ with $x'_0 = 1$, $\mathbf{x}' \in \mathbb{R}^3$, and $\|\mathbf{x}'\| = 1$. Writing $\mathbf{x}' = R\mathbf{x}$ for some map $R : S^2 \to S^2$, we have

$u(\mathbf{x} \cdot \boldsymbol{\sigma})u^* = (R\mathbf{x}) \cdot \boldsymbol{\sigma}$, (5.44)

where $\mathbf{x} \cdot \boldsymbol{\sigma} = \sum_{j=1}^{3} x_j \sigma_j$. This also shows that $R$ extends to a linear isometry $R : \mathbb{R}^3 \to \mathbb{R}^3$. Using the formula $\mathrm{Tr}(\sigma_i \sigma_j) = 2\delta_{ij}$, the matrix-form of $R$ follows as

$R_{ij} = \frac{1}{2} \mathrm{Tr}(u \sigma_j u^* \sigma_i)$. (5.45)
Define $U(2)$ as the (connected) group of all unitary $2 \times 2$ matrices (whose connected subgroup $SU(2)$ of elements with unit determinant has just been mentioned). Also, recall that $O(3)$ is the group of all real orthogonal $3 \times 3$ matrices $M$, a condition that may be expressed in (at least) four equivalent ways (like unitarity):

• $MM^T = M^TM = 1_3$;

• $M$ is invertible with $M^{-1} = M^T$;

• $M$ is an isometry (and hence it is injective and therefore invertible);

• $M$ preserves the inner product: $\langle M\mathbf{x}, M\mathbf{y} \rangle = \langle \mathbf{x}, \mathbf{y} \rangle$ for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^3$.

This implies $\det(M) = \pm 1$ (as can be seen from the eigenvalues of $M$: being a real linear isometry, its eigenvalues have modulus one and the non-real ones occur in complex-conjugate pairs, so that $\det(M)$, being their product, equals $\pm 1$). Thus $O(3)$ breaks up into two parts $O_\pm(3) = \{R \in O(3) \mid \det(R) = \pm 1\}$, of which $O_+(3) = SO(3)$ consists of rotations. Using an explicit parametrization of $SO(3)$, e.g., through Euler angles, or, using surjectivity of the exponential map (from the Lie algebra of $SO(3)$, which consists of anti-symmetric real matrices), it follows that $O_\pm(3)$ are precisely the two connected components of $O(3)$, the identity of course lying in $O_+(3)$.

Proposition 5.5. The map $u \mapsto R$ defined by (5.44) is a homomorphism from $U(2)$ onto $SO(3)$. In terms of $SU(2) \subset U(2)$, this map restricts to a two-fold covering

$\tilde{\pi} : SU(2) \to SO(3)$, (5.46)

with discrete kernel

$\ker(\tilde{\pi}) = \{1_2, -1_2\}$. (5.47)

Proof. As a finite-dimensional linear isometry, $R$ is automatically invertible (this also follows from unitarity and hence invertibility of $u$), hence $R \in O(3)$. It is obvious from (5.44) that $u \mapsto R$ is a continuous homomorphism (of groups). Since $U(2)$ is connected and the map is continuous, $R$ must lie in the connected component of $O(3)$ containing the identity, whence $R \in SO(3)$. To show surjectivity of $\tilde{\pi}$, take some unit vector $\mathbf{u} \in \mathbb{R}^3$ and define $u = \cos(\frac{1}{2}\theta) + i\sin(\frac{1}{2}\theta)\, \mathbf{u} \cdot \boldsymbol{\sigma}$. The corresponding rotation $R_\theta(\mathbf{u})$ is the one around $\mathbf{u}$ by an angle $\theta$, and such rotations generate $SO(3)$.

Finally, it follows from (5.44) that $u \in \ker(\tilde{\pi})$ iff $u$ commutes with each $\sigma_i$ and hence, by (5.41), with all matrices. Therefore, $u = z \cdot 1_2$ for some $z \in \mathbb{C}$, upon which the condition $\det(u) = 1$ (in that $u \in SU(2)$) enforces $z = \pm 1$. □

Note that the covering (5.46) is topologically nontrivial (i.e., $SU(2) \not\cong SO(3) \times \mathbb{Z}_2$), since $SU(2) \cong S^3$ is simply connected, whereas $SO(3)$ is doubly connected: a closed path $t \mapsto R_{2\pi t}(\mathbf{u})$, $t \in [0, 1]$ in $SO(3)$ (starting and ending at $1_3$) lifts to a path

$t \mapsto \cos(\pi t) + i\sin(\pi t)\, \mathbf{u} \cdot \boldsymbol{\sigma}$

in $SU(2)$ that starts at the unit matrix $1_2$ and ends at $-1_2$.

To incorporate $O_-(3)$, let $U_a(2)$ be the set of all anti-unitary $2 \times 2$ matrices. These do not form a group, as the product of two anti-unitaries is unitary, but the union $U(2) \cup U_a(2)$ is a disconnected Lie group with identity component $U(2)$.
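To see concretely how an anti-unitary operator reaches $O_-(3)$, the following Python sketch takes the complex conjugation $J$ of (5.36) on $\mathbb{C}^2$ and evaluates the analogue of (5.45). It uses the elementary fact that $J a J^{-1}$ is the entrywise complex conjugate of the matrix $a$ (since $J a J\psi = \overline{a \overline{\psi}} = \overline{a}\psi$). All helper names are assumptions made for this example.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3 as in (5.42).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def conjugate_entries(m):
    """Entrywise complex conjugate, i.e. the matrix of J m J^{-1}."""
    return [[complex(x).conjugate() for x in row] for row in m]

def apply_J(psi):
    """Complex conjugation J on C^2, as in (5.36)."""
    return [complex(z).conjugate() for z in psi]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# R_ij = (1/2) Tr(J sigma_j J^{-1} sigma_i): the action of J on Bloch vectors.
R_J = [[(0.5 * trace(matmul(conjugate_entries(SIGMA[j]), SIGMA[i]))).real
        for j in range(3)] for i in range(3)]

psi = [1 + 2j, -0.5j]
twice = apply_J(apply_J(psi))     # J^2 acts as the identity, a unitary operator
```

Only $\sigma_2$ has non-real entries, so conjugation flips the second coordinate and leaves the other two alone; the resulting matrix has determinant $-1$ and hence lies outside $SO(3)$.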

Proposition 5.6. The map $u \mapsto R$ defined by (5.44) is a surjective homomorphism

$\tilde{\pi}' : U(2) \cup U_a(2) \to O(3)$, (5.48)

with kernel $U(1)$, seen as the diagonal matrices $z \cdot 1_2$, $z \in \mathbb{T}$. Moreover, $\tilde{\pi}'$ maps $U(2)$ onto $SO(3)$ and maps $U_a(2)$ onto $O_-(3)$.

Proof. The map $u \mapsto R$ in (5.44) sends the anti-unitary operator $u = J$ on $\mathbb{C}^2$ to $R = \mathrm{diag}(1, -1, 1) \in O_-(3)$. Since $U_a(2) = J \cdot U(2)$ and similarly $O_-(3) = R \cdot SO(3)$, the last claim follows. The computation of the kernel may now be restricted to $U(2)$, and then follows as in the last step of the proof of the previous proposition. □

We now return to Theorem 5.4 and go through its special cases one by one.

Part 1 of Theorem 5.4 is Wigner's Theorem, which in the case at hand reads:

Theorem 5.7. Each bijection $W : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2)$ that satisfies

$\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef)$ (5.49)

for each $e, f \in \mathcal{P}_1(\mathbb{C}^2)$ takes the form $W(e) = ueu^*$, where $u$ is either unitary or anti-unitary, and is uniquely determined by $W$ up to a phase.

To prove this, we transfer the whole situation to the two-sphere, where it is easy:

Proposition 5.8. The pure state space $\mathcal{P}_1(\mathbb{C}^2)$ corresponds bijectively to the sphere

$S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\}$,

in that each one-dimensional projection $e \in \mathcal{P}_1(\mathbb{C}^2)$ may be expressed uniquely as

$e(x, y, z) = \frac{1}{2} \begin{pmatrix} 1 + z & x - iy \\ x + iy & 1 - z \end{pmatrix}$, (5.50)

where $(x, y, z) \in \mathbb{R}^3$ and $x^2 + y^2 + z^2 = 1$. Under the ensuing bijection

$\mathcal{P}_1(\mathbb{C}^2) \cong S^2$, (5.51)

Wigner symmetries $W$ of $\mathbb{C}^2$ turn into orthogonal maps $R \in O(3)$, restricted to $S^2$.

Proof. The first claim restates Proposition 2.9. If $\psi$ and $\psi'$ are unit vectors in $\mathbb{C}^2$ with corresponding one-dimensional projections $e_\psi(x, y, z)$ and $e_{\psi'}(x', y', z')$ then, as one easily verifies, the corresponding transition probability takes the form

$\mathrm{Tr}(e_\psi e_{\psi'}) = \frac{1}{2}(1 + \langle \mathbf{x}, \mathbf{x}' \rangle) = \cos^2(\frac{1}{2}\theta(\mathbf{x}, \mathbf{x}'))$, (5.52)

where $\theta(\mathbf{x}, \mathbf{x}')$ is the arc (i.e., geodesic) distance between $\mathbf{x}$ and $\mathbf{x}'$. Consequently, if $W : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2)$ satisfies (5.8), then the corresponding map $R : S^2 \to S^2$ (defined through the above identification $\mathcal{P}_1(\mathbb{C}^2) \cong S^2$) satisfies

$\langle R(\mathbf{x}), R(\mathbf{x}') \rangle = \langle \mathbf{x}, \mathbf{x}' \rangle$ ($\mathbf{x}, \mathbf{x}' \in S^2$). (5.53)
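The identity (5.52), on which the passage from (5.8) to (5.53) rests, can be checked directly from the parametrization (5.50). The Python sketch below does so for two sample points of $S^2$; the points and the helper names are assumptions chosen for this illustration.

```python
import math

def projection(x, y, z):
    """One-dimensional projection e(x, y, z) of (5.50) for a point of S^2."""
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def transition_probability(p, q):
    """Tr(e e') for the projections attached to the points p and q."""
    return trace(matmul(projection(*p), projection(*q))).real

p = normalize((1.0, 2.0, -0.5))
q = normalize((-0.3, 0.4, 1.0))

dot = sum(a * b for a, b in zip(p, q))     # inner product <x, x'>
angle = math.acos(dot)                     # geodesic distance on S^2

lhs = transition_probability(p, q)
middle = 0.5 * (1 + dot)
rhs = math.cos(angle / 2) ** 2

e = projection(*p)
e_squared = matmul(e, e)
```

Since the transition probability depends on the two points only through $\langle \mathbf{x}, \mathbf{x}' \rangle$, preserving one is the same as preserving the other.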

Lemma 5.9. If some bijection $R : S^2 \to S^2$ satisfies (5.53), then $R$ extends (uniquely) to an orthogonal linear map (for simplicity also called) $R : \mathbb{R}^3 \to \mathbb{R}^3$.

Proof. With $(\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ the standard basis of $\mathbb{R}^3$, define a $3 \times 3$ matrix by

$R_{kl} = \langle \mathbf{u}_k, R(\mathbf{u}_l) \rangle$. (5.54)

It follows from (5.53) that $R^{-1}(\mathbf{u}_j)_k = R_{jk}$, which implies $\langle R^{-1}(\mathbf{u}_j), \mathbf{x} \rangle = \sum_k R_{jk} x_k$, or, once again using (5.53), $R(\mathbf{x})_j = \sum_k R_{jk} x_k$. Hence the map $\mathbf{x} \mapsto \sum_{j,k} R_{jk} x_k \mathbf{u}_j$, i.e., the usual linear map defined by the matrix (5.54), extends the given bijection $R$. Orthogonality of this linear map is, of course, equivalent to (5.53). □

Wigner's Theorem then follows by combining Propositions 5.6 and 5.8: given the linear map $R$ just constructed, read (5.44) from right to left, where $u$ exists by surjectivity of the map (5.48), and the precise lack of uniqueness of $u$ as claimed in Theorem 5.4 is just a restatement of the fact that (5.48) has $U(1)$ as its kernel. □

Kadison's Theorem is part 2 of Theorem 5.4. Explicitly, for $H = \mathbb{C}^2$ we have:

Theorem 5.10. Each affine bijection $K : \mathcal{D}(\mathbb{C}^2) \to \mathcal{D}(\mathbb{C}^2)$ is given as $K(\rho) = u\rho u^*$, where $u$ is unitary or anti-unitary, and is uniquely determined by $K$ up to a phase.

Proof. We once again invoke Proposition 2.9, implying that any density matrix $\rho$ on $\mathbb{C}^2$ takes the form

$\rho = \frac{1}{2}\left(1_2 + \sum_{j=1}^{3} x_j \sigma_j\right)$, (5.55)

with $\|\mathbf{x}\| \leq 1$. Moreover, the ensuing bijection $\mathcal{D}(\mathbb{C}^2) \cong B^3$, $\rho \mapsto \mathbf{x}$, is clearly affine, in that convex sums $t\rho + (1 - t)\rho'$ of density matrices correspond to convex sums $t\mathbf{x} + (1 - t)\mathbf{x}'$ of the corresponding vectors in $\mathbb{R}^3$.

Lemma 5.11. Any affine bijection K of the unit ball eP in is given by an orthog¬ 

onal linear map R G (9(3). 

Proof. First, K must map the boundary deEP = to itself (necessarily bijectively): 
if X G 5^ and K(x) = tx' + (1 — t)'s!', then x = (x') + (1 — (x"), whence 

/:-'(x') =/:^‘(x"), (5.56) 

since x is pure, whence x' = x", so that also K(x) is pure. 

Second, the basis of all further steps is the property

$K(0) = 0$. (5.57)

This is because $0$ is intrinsic to the convex structure of $B^3$: it is the unique point with the property that for any $x \in S^2$ there exists a unique $x'$ such that $\tfrac12 x + \tfrac12 x' = 0$, namely $x' = -x$. Thus $0$ must be preserved under affine bijections. For a formal proof (by contradiction), suppose $K(0) \neq 0$, and define $y = K(0)/\|K(0)\| \in S^2$. Then $K(0)$ has an extremal decomposition $K(0) = ty + (1-t)y'$, with $y' = -y$ and $t = \tfrac12(1 + \|K(0)\|)$. Applying the affine map $K^{-1}$ then gives

$\|K^{-1}(y')\| = \|K^{-1}(y)\| \cdot \dfrac{1 + \|K(0)\|}{1 - \|K(0)\|}$.

Now $y \in S^2$ and hence $K^{-1}(y) \in S^2$ by part one of this proof (applied to $K^{-1}$), so that $\|K^{-1}(y)\| = 1$. But this implies $\|K^{-1}(y')\| > 1$, which is impossible because $y' \in S^2$ and hence $\|K^{-1}(y')\| = 1$.

Third, for $x \in B^3$ and $t \in [0,1]$ the preceding point implies that

$K(tx) = K(tx + (1-t)0) = tK(x) + (1-t)K(0) = tK(x)$. (5.58)

The same then holds for $x \in B^3$ and all $t \geq 0$ as long as $tx \in B^3$: indeed, take $t > 1$, so that $t^{-1} \in (0,1)$, and use the previous step with $x \rightsquigarrow tx$ and $t \rightsquigarrow t^{-1}$ to compute

$K(tx) = t\,t^{-1}K(tx) = tK(t^{-1}tx) = tK(x)$.

Also, (5.58) and affinity imply that for any $x, y \in B^3$ for which $x + y \in B^3$ we have

$K(x+y) = 2K(\tfrac12 x + \tfrac12 y) = 2\cdot(\tfrac12 K(x) + \tfrac12 K(y)) = K(x) + K(y)$. (5.59)

With our earlier result (5.57), this also gives

$K(-x) = -K(x)$. (5.60)

For some nonzero $x \in \mathbb{R}^3$, take $s \geq \|x\|$ and $t \geq \|x\|$. Then by (5.58) we have

$sK(x/s) = sK\left(\tfrac{t}{s}\cdot\tfrac{x}{t}\right) = tK(x/t)$.

We may therefore define a map $R : \mathbb{R}^3 \to \mathbb{R}^3$ by

$R(0) = 0$; (5.61)

$R(x) = s\cdot K(x/s)$ $(x \neq 0)$, (5.62)

for any choice of $s \geq \|x\|$. For $x \in B^3$ we may take $s = 1$, so that $R$ extends $K$.

To prove that $R$ is linear, for nonzero $x \in \mathbb{R}^3$ and $t \geq 0$ pick some $s \geq t\|x\|$ and compute

$R(tx) = sK\left(\dfrac{tx}{s}\right) = sK\left(\dfrac{t\|x\|}{s}\cdot\dfrac{x}{\|x\|}\right) = t\|x\|\,K\left(\dfrac{x}{\|x\|}\right) = tR(x)$. (5.63)

For $t < 0$, we first show from (5.60) and (5.62) that

$R(-x) = -R(x)$, (5.64)

upon which (5.63) gives

$R(tx) = R(|t|\cdot(-x)) = |t|R(-x) = -|t|R(x) = tR(x)$. (5.65)

Furthermore, for given $x, y \in \mathbb{R}^3$, pick $s' > 0$ such that $s' \geq \|x\|$ and $s' \geq \|y\|$, so that $s = 2s' \geq \|x + y\|$ by the triangle inequality, and use (5.59) to compute

$R(x+y) = sK\left(\dfrac{x+y}{s}\right) = sK\left(\dfrac{x}{s} + \dfrac{y}{s}\right) = sK(x/s) + sK(y/s) = R(x) + R(y)$. (5.66)

Finally, $R$ is an isometry by (5.62) and step one of the proof. Being also linear and invertible, $R$ must therefore be an orthogonal transformation. □
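The rescaling trick behind (5.61) and (5.62) can be sketched in a few lines. The map `K` below is an assumed sample affine bijection of the unit ball (a rotation combined with a reflection), and the function only ever evaluates `K` inside the ball; the checks mirror the steps of the proof (independence of $s$, additivity, homogeneity for negative scalars, isometry).

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def K(x):
    """Assumed sample affine bijection of the unit ball B^3."""
    c, s = math.cos(1.1), math.sin(1.1)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], -x[2])

def R(x, s=None):
    """Extension of K to R^3: R(0) = 0 and R(x) = s * K(x / s) for any s >= ||x||."""
    n = norm(x)
    if n == 0.0:
        return (0.0, 0.0, 0.0)
    if s is None:
        s = n
    assert s >= n, "x / s must lie in the unit ball"
    y = K(tuple(c / s for c in x))
    return tuple(s * c for c in y)

def close(a, b, tol=1e-9):
    return all(abs(p - q) <= tol for p, q in zip(a, b))

x = (3.0, -4.0, 12.0)   # norm 13, well outside the unit ball
y = (-1.0, 2.0, 0.5)

independent_of_s = close(R(x, 13.0), R(x, 20.0))
additive = close(R(tuple(p + q for p, q in zip(x, y))),
                 tuple(p + q for p, q in zip(R(x), R(y))))
homogeneous = close(R(tuple(-2.5 * p for p in x)), tuple(-2.5 * p for p in R(x)))
isometric = abs(norm(R(x)) - norm(x)) < 1e-9
```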
Given step one, an alternative proof derives this lemma from Proposition 5.18 below, which shows that the transition probabilities (5.52) on $S^2$ are determined by the convex structure of $B^3$, so that affine bijections must preserve them. In other words, the boundary map $S^2 \to S^2$ defined by $K$ preserves transition probabilities and hence satisfies the conditions of Lemma 5.9. This reasoning effectively reduces Kadison’s Theorem to Wigner’s Theorem, a move we will later examine in general.

In any case, Theorem 5.10 now follows from Lemma 5.11 in exactly the same way as Theorem 5.7 followed from the corresponding Lemma 5.9. □

We have given this proof in some detail, because step 3 will recur on other occasions 
where a given affine bijection is to be extended to some linear map. 

Ludwig’s Theorem is part 3 of Theorem 5.4. For $H = \mathbb{C}^2$, we have:

Theorem 5.12. Each affine order isomorphism $L : \mathcal{E}(\mathbb{C}^2) \to \mathcal{E}(\mathbb{C}^2)$ reads $L(a) = uau^*$, where $u$ is unitary or anti-unitary, and is uniquely fixed by $L$ up to a phase.

Proof. Using the parametrization (5.41), we have $a(x_0, x_1, x_2, x_3) \in \mathcal{E}(\mathbb{C}^2)$ iff each $x_\mu$ is real and $0 \leq x_0 \pm \|x\| \leq 2$. In particular, we have $0 \leq x_0 \leq 2$. This easily follows from (2.38), noting that $a \in \mathcal{E}(\mathbb{C}^2)$ just means that $a^* = a$ and that both eigenvalues of $a$ lie in $[0,1]$. Thus $\mathcal{E}(\mathbb{C}^2)$ is isomorphic as a convex set to a convex subset $C$ of $\mathbb{R}^4$ that is fibered over the $x_0$-interval $[0,2]$, where the fiber $C_{x_0}$ of $C$ over $x_0$ is the three-ball $B^3_{x_0}$ with radius $\|x\| = x_0$ as long as $0 \leq x_0 \leq 1$, whereas for $1 \leq x_0 \leq 2$ the fiber is $B^3_{2-x_0}$, so at $x_0 = 1$ the fiber is $C_1 = B^3_1 = B^3$ (in one dimension less, this convex body is easily visualizable as a double cone in $\mathbb{R}^3$, where the fibers are disks). The partial order on $C$ induced from the one on $\mathcal{E}(\mathbb{C}^2)$ is given by

$(x_0, x) \leq (x_0', x')$ iff $x_0' - x_0 \geq \|x' - x\|$, (5.67)

which follows from (5.41) and (2.38), noting that for matrices one has $a \leq a'$ iff $a' - a$ has positive eigenvalues. A similar argument to the one proving (5.57) then shows that any affine bijection $L$ of $C$ must map the base space $[0,2]$ to itself (as an affine bijection), and hence either $x_0 \mapsto x_0$ or $x_0 \mapsto 2 - x_0$. The latter fails to preserve order, so $L$ must fix $x_0$. Similarly, $L$ maps each three-ball $C_{x_0}$ to itself by an affine bijection, which, by the same proof as for Kadison’s Theorem above, must be induced by some element $R_{x_0}$ of $O(3)$. Finally, the order-preserving condition $x_0' - x_0 \geq \|x' - x\| \Leftrightarrow x_0' - x_0 \geq \|R_{x_0'}x' - R_{x_0}x\|$ obtained from (5.67) and the property $L(x_0) = x_0$ just found can only be met if $R_{x_0}$ is independent of $x_0$. □
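The order relation (5.67) can be compared numerically with the operator order on $2\times 2$ matrices. The sketch below assumes the parametrization $a = \tfrac12(x_0 1_2 + x\cdot\sigma)$, which is what the eigenvalue condition $0 \leq x_0 \pm \|x\| \leq 2$ suggests for (5.41); the sample points and function names are hypothetical.

```python
import math

def effect(x0, x):
    """a = (x0 * 1 + x . sigma)/2 as a 2x2 matrix (assumed form of (5.41))."""
    x1, x2, x3 = x
    return [[(x0 + x3) / 2, (x1 - 1j * x2) / 2],
            [(x1 + 1j * x2) / 2, (x0 - x3) / 2]]

def min_eigenvalue(m):
    """Smallest eigenvalue of a Hermitian 2x2 matrix via trace and determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)
    return (tr - math.sqrt(disc)) / 2

def leq_matrix(a, b, tol=1e-12):
    """a <= b iff b - a has no negative eigenvalue."""
    diff = [[b[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    return min_eigenvalue(diff) >= -tol

def leq_cone(p, q, tol=1e-12):
    """Criterion (5.67): (x0, x) <= (x0', x') iff x0' - x0 >= ||x' - x||."""
    (x0, x), (y0, y) = p, q
    return y0 - x0 >= math.sqrt(sum((s - t) ** 2 for s, t in zip(y, x))) - tol

samples = [
    ((0.4, (0.1, 0.0, 0.2)), (1.0, (0.3, 0.2, 0.4))),   # comparable pair
    ((0.4, (0.1, 0.0, 0.2)), (0.5, (-0.3, 0.2, 0.0))),  # incomparable pair
    ((1.0, (0.0, 0.0, 1.0)), (1.0, (0.0, 0.0, 1.0))),   # equal elements
]
agree = [leq_matrix(effect(*p), effect(*q)) == leq_cone(p, q) for p, q in samples]
```

In each sample the matrix test and the cone test should give the same verdict.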

Part 3 of Theorem 5.4 does not carry an official name; it may be attributed to Kadison, too, but the hard part of the proof was given earlier by Jacobson and Rickart. Rather than a contrived (though historically justified) name like “Jacobson-Rickart-Kadison Theorem”, we will simply speak of Jordan’s Theorem (for $H = \mathbb{C}^2$):

Theorem 5.13. Each linear bijection $J : M_2(\mathbb{C})_{sa} \to M_2(\mathbb{C})_{sa}$ that satisfies (5.13) and hence (5.12) takes the form $J(a) = uau^*$, where $u$ is either unitary or anti-unitary, and is uniquely determined by $J$ up to a phase.



Proof. First, any Jordan map (and hence a fortiori any Jordan automorphism) trivially maps projections into projections, as it preserves the defining conditions $e^2 = e^* = e$. Second, any Jordan automorphism $J$ maps one-dimensional projections into one-dimensional projections: if $e \in \mathcal{P}_1(\mathbb{C}^2)$, then $J(e) \neq 0$ and $J(e) \neq 1_2$, both because $J$ is injective in combination with $J(0) = 0$ and $J(1_2) = 1_2$, respectively. Hence $J(e) \in \mathcal{P}_1(\mathbb{C}^2)$, since this is the only remaining possibility (a more sophisticated argument shows that this is even true for any Hilbert space $H$). From (5.41) and subsequent text, as in (5.44), by linearity of $J$ we therefore have

$J\left(\sum_{j=1}^{3} x_j \sigma_j\right) = \sum_{j=1}^{3} R(x)_j \sigma_j$ (5.68)

for some map $R : S^2 \to S^2$, which is bijective because $J$ is. Linearity of $J$ then allows us to extend $R$ to a linear map $\mathbb{R}^3 \to \mathbb{R}^3$, with matrix

$R_{jk} = \tfrac12 \mathrm{Tr}(\sigma_j J(\sigma_k))$, (5.69)

cf. (5.45). By (5.69), this linear map restricts to the given bijection $R : S^2 \to S^2$, which also shows that it is isometric. Thus we have a linear isometry on $\mathbb{R}^3$, which therefore lies in $O(3)$. The proof may then be completed as in Theorem 5.7. □

The case $H = \mathbb{C}^2$ was already exceptional in the context of Gleason’s Theorem, and it remains so as far as weak Jordan symmetries and Bohr symmetries are concerned.

Proposition 5.14. The poset $\mathcal{C}(M_2(\mathbb{C}))$ is isomorphic to $\{\bot\} \cup \mathbb{RP}^2$, where the real projective plane $\mathbb{RP}^2$ is the quotient $S^2/\!\sim$ under the equivalence relation $x \sim -x$, and the only nontrivial ordering is $\bot \leq p$ for any $p \in \mathbb{RP}^2$.

Proof. It is elementary that $M_2(\mathbb{C})$ has a single one-dimensional unital *-subalgebra, namely $\mathbb{C}\cdot 1$, the multiples of the unit; this gives the singleton $\bot$ in $\mathcal{C}(M_2(\mathbb{C}))$.

Furthermore, any two-dimensional unital *-subalgebra $C$ of $M_2(\mathbb{C})$ is generated by a one-dimensional projection $e$, in that $C$ is the linear span of $e$ and $1_2$. Hence $C$ is also the linear span of (the projection) $1_2 - e$ and $1_2$. In our parametrization of all one-dimensional projections $e$ on $\mathbb{C}^2$ by $S^2$ (cf. Proposition 2.9), if $e$ corresponds to $x$, then $1_2 - e$ corresponds to $-x$. This yields the remainder $\mathbb{RP}^2$ of $\mathcal{C}(M_2(\mathbb{C}))$.

Finally, commutative unital *-subalgebras $D$ of $M_2(\mathbb{C})$ of dimension $> 2$ do not exist. For any such algebra $D$ would contain some two-dimensional $C$ just defined, but a simple computation (for example, in a basis where $C$ consists of all diagonal matrices) shows that the only matrices that commute with all elements of $C$ already lie in $C$ (i.e., are diagonal). Hence no commutative extension of $C$ exists. □

Bohr symmetries $B$ for $\mathbb{C}^2$ therefore correspond to bijections of $\mathbb{RP}^2$. Similarly, weak Jordan symmetries $J$ for $\mathbb{C}^2$ correspond to bijections of $S^2$ (the difference with Bohr symmetries lies in the fact that $J$ may also map $C = \mathrm{span}(e, 1_2)$ to itself nontrivially, i.e., by sending $e$ to $1_2 - e$, which for $B$ would yield the identity map). In both cases, few of these bijections are (anti-)unitarily implemented.
5.3 Equivalence between the six symmetry theorems

If $\dim(H) > 1$, the first three claims of Theorem 5.4 are equivalent; if $\dim(H) > 2$, all claims are. We will show this in some detail, if only because the proofs of the various equivalences relate the six symmetry concepts stated in Definition 5.1 in an instructive way. We will do this in the sequence Wigner $\leftrightarrow$ Kadison $\leftrightarrow$ Jordan, and subsequently Jordan $\leftrightarrow$ Ludwig, Jordan $\leftrightarrow$ von Neumann, and Jordan $\leftrightarrow$ Bohr. Consequently, in principle only one part of Theorem 5.4 requires a proof. Although this is redundant, we will, in fact, prove both Wigner’s Theorem and Jordan’s (indeed, no independent proof of the other parts of Theorem 5.4 seems to be known!). The most transparent way to state the various equivalences is to note that in each case the set of symmetries of some given kind (i.e., Wigner, ...) forms a group. In all cases, the nontrivial part of the proof is the establishment of a “natural” bijection, from which the group homomorphism property is trivial (and hence will not be proved).

Proposition 5.15. There is an isomorphism of groups between:

• The group of affine bijections $K : \mathcal{D}(H) \to \mathcal{D}(H)$;

• The group of bijections $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ that satisfy (5.8), viz.

$W = K|_{\mathcal{P}_1(H)}$; (5.70)

$K\left(\sum_i \lambda_i e_{\upsilon_i}\right) = \sum_i \lambda_i W(e_{\upsilon_i})$, (5.71)

where $\rho = \sum_i \lambda_i e_{\upsilon_i}$ is some (not necessarily unique) expansion of $\rho \in \mathcal{D}(H)$ in terms of a basis of eigenvectors $\upsilon_i$ with eigenvalues $\lambda_i$, where $\lambda_i \geq 0$ and $\sum_i \lambda_i = 1$. In particular, (5.70) and (5.71) are well defined.

Proof. It is conceptually important to distinguish between $B(H)_{sa}$ as a Banach space in the usual operator norm $\|\cdot\|$, and $B_1(H)_{sa}$, the Banach space of trace-class operators in its intrinsic norm $\|\cdot\|_1$. Of course, if $\dim(H) < \infty$, then $B(H)_{sa} = B_1(H)_{sa}$ as vector spaces, but even in that case the two norms do not coincide (although they are equivalent). The proof below has the additional advantage of immediately generalizing to the infinite-dimensional case. We start with (5.70).

1. Since $\mathcal{P}_1(H) = \partial_e \mathcal{D}(H)$, by the same argument as in the proof of Lemma 5.11, any affine bijection of the convex set $\mathcal{D}(H)$ must preserve its boundary, so that $K$ maps $\mathcal{P}_1(H)$ into itself, necessarily bijectively. The goal of the next two steps is to prove that (5.70) satisfies (5.8), i.e., preserves transition probabilities.

2. An affine bijection $K : \mathcal{D}(H) \to \mathcal{D}(H)$ extends to an isometric isomorphism $K_1 : B_1(H)_{sa} \to B_1(H)_{sa}$ with respect to the trace-norm $\|\cdot\|_1$, as follows:

a. Put $K_1(0) = 0$ and for $b \geq 0$, $b \in B_1(H)$, i.e. $b \in B_1(H)_+$, and $b \neq 0$, define

$K_1(b) = \|b\|_1 K(b/\|b\|_1)$. (5.72)
By construction, $K_1$ is isometric and preserves positivity. For $b \in B_1(H)_+$ we have $\mathrm{Tr}(b) = \|b\|_1$, hence $b/\|b\|_1 \in \mathcal{D}(H)$, on which $K$ is defined.

Linearity of $K_1$ with positive coefficients (as a consequence of the affine property of $K$) is verified as in the proof of Lemma 5.11; this time, use

$a + b = (\|a\|_1 + \|b\|_1)\cdot\left(t\dfrac{a}{\|a\|_1} + (1-t)\dfrac{b}{\|b\|_1}\right)$, (5.73)

with $t = \|a\|_1/(\|a\|_1 + \|b\|_1)$. Note that if $a, b \in B_1(H)_+$, then $a + b \in B_1(H)_+$.

b. For $b \in B_1(H)_{sa}$, decompose $b = b_+ - b_-$, where $b_\pm \geq 0$; see Proposition A.24 (this remains valid in general Hilbert spaces). We then define

$K_1(b) = K_1(b_+) - K_1(b_-)$. (5.74)

To show that this makes $K_1$ linear on all of $B_1(H)_{sa}$, suppose $b = b'_+ - b'_-$ with $b'_\pm \geq 0$. Then $b'_+ + b_- = b_+ + b'_-$, and since each term is positive, $K_1(b'_+) + K_1(b_-) = K_1(b_+) + K_1(b'_-)$ by the previous step. Hence $K_1(b'_+) - K_1(b'_-) = K_1(b_+) - K_1(b_-)$, so that (5.74) is actually independent of the choice of the decomposition of $b$ as long as the operators are positive. Hence for $a, b \in B_1(H)_{sa}$ we may compute

$K_1(a + b) = K_1(a_+ + b_+ - (a_- + b_-)) = K_1(a_+ + b_+) - K_1(a_- + b_-)$

$= K_1(a_+) + K_1(b_+) - K_1(a_-) - K_1(b_-) = K_1(a) + K_1(b)$,

since $a_+ + b_+$ and $a_- + b_-$ are both positive.

The key point in verifying isometry of $K_1$ is the property $|b| = b_+ + b_-$, which follows from (A.76) or Theorem B.94. Using this property, we have

$\|K_1(b)\|_1 = \mathrm{Tr}(|K_1 b|) = \mathrm{Tr}(|K_1(b_+) - K_1(b_-)|) = \mathrm{Tr}(K_1(b_+) + K_1(b_-))$

$= \mathrm{Tr}(b_+ + b_-) = \mathrm{Tr}(|b_+ - b_-|) = \mathrm{Tr}(|b|) = \|b\|_1$.

3. For any two unit vectors $\psi, \varphi$ in $H$ we have the formula

$\|e_\psi - e_\varphi\|_1 = 2\sqrt{1 - \mathrm{Tr}(e_\psi e_\varphi)}$, (5.75)

which can easily be proved by a calculation with $2 \times 2$ matrices (since everything takes place in the two-dimensional subspace spanned by $\psi$ and $\varphi$, except when $\varphi = z\psi$, $z \in \mathbb{T}$, in which case (5.75) reads $0 = 0$ and hence is true also). Since $K_1$ is linear as well as isometric with respect to the trace-norm, we have

$\|K_1(e_\psi) - K_1(e_\varphi)\|_1 = \|K_1(e_\psi - e_\varphi)\|_1 = \|e_\psi - e_\varphi\|_1$,

and hence, by (5.75), $\mathrm{Tr}(K_1(e_\psi)K_1(e_\varphi)) = \mathrm{Tr}(e_\psi e_\varphi)$. Eq. (5.70) then gives (5.8).
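Formula (5.75) is easy to test numerically in $\mathbb{C}^2$, where the trace norm of a Hermitian matrix is the sum of the absolute values of its two eigenvalues. The vectors and helper names below are arbitrary choices made for this sketch.

```python
import math

def normalize(v):
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / n for c in v]

def projector(psi):
    """Rank-one projection e_psi = |psi><psi| for a unit vector psi in C^2."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def trace_norm_hermitian(m):
    """||m||_1 = |l1| + |l2| for a Hermitian 2x2 matrix with eigenvalues l1, l2."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    root = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return abs((tr + root) / 2) + abs((tr - root) / 2)

def transition_probability(psi, phi):
    """Tr(e_psi e_phi) = |<psi, phi>|^2."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi))) ** 2

psi = normalize([1 + 2j, 0.5 - 1j])
phi = normalize([0.3j, 1.0 + 0j])

e_psi, e_phi = projector(psi), projector(phi)
diff = [[e_psi[i][j] - e_phi[i][j] for j in range(2)] for i in range(2)]

lhs = trace_norm_hermitian(diff)                               # ||e_psi - e_phi||_1
rhs = 2 * math.sqrt(1 - transition_probability(psi, phi))      # right-hand side of (5.75)
```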
We move on to (5.71). The main concern is that this expression be well defined, since in case some eigenvalue $\lambda > 0$ of $\rho$ is degenerate (necessarily with finite multiplicity, even in infinite dimension, since $\rho$ is compact), the basis of the eigenspace $H_\lambda$ that takes part in the sum is far from unique. This is settled as follows:

Lemma 5.16. Let $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ be a bijection that satisfies (5.8), let $L \subset H$ be a (finite-dimensional) subspace, and let $(\upsilon_i)$ and $(\upsilon'_i)$ be bases of $L$. Then

$\sum_i W(e_{\upsilon_i}) = \sum_i W(e_{\upsilon'_i})$. (5.76)

Proof. As usual, for projections $e$ and $f$ on $H$ we write $e \leq f$ iff $eH \subseteq fH$. From (B.212) and (B.214) we have $\sum_j \mathrm{Tr}(e_{\upsilon_j} e_\psi) \leq 1$ for any unit vector $\psi \in H$, with equality iff $\psi \in L$. In other words, $e_\psi \leq e_L$ iff $\sum_j \mathrm{Tr}(e_{\upsilon_j} e_\psi) = 1$, where $e_L = \sum_j e_{\upsilon_j}$ is the projection onto $L$. Furthermore, by (5.8) the images $W(e_{\upsilon_j})$ remain orthogonal; hence $\sum_j W(e_{\upsilon_j})$ is a projection, and $e \leq \sum_j W(e_{\upsilon_j})$ iff $\sum_j \mathrm{Tr}(W(e_{\upsilon_j}) e) = 1$. By (5.8), this condition is satisfied for $e = W(e_{\upsilon'_i})$, so that $W(e_{\upsilon'_i}) \leq \sum_j W(e_{\upsilon_j})$ for each $i$. Since also the projections $W(e_{\upsilon'_i})$ are orthogonal, this gives $\sum_i W(e_{\upsilon'_i}) \leq \sum_j W(e_{\upsilon_j})$. Interchanging the roles of the two bases gives the converse, yielding (5.76). □

Finally, to prove bijectivity of the correspondence $K \leftrightarrow W$, we need the property

$K\left(\sum_i \lambda_i e_{\upsilon_i}\right) = \sum_i \lambda_i K(e_{\upsilon_i})$, (5.77)

since this implies that $K$ is determined by its action on $\mathcal{P}_1(H) \subset \mathcal{D}(H)$. In finite dimension this follows from convexity of $K$, and we are done. In infinite dimension, we in addition need continuity of $K$, as well as convergence of the sum $\sum_i \lambda_i e_{\upsilon_i}$ not only in the operator norm (as follows from the spectral theorem for self-adjoint compact operators), but also in the trace norm: for finite $n, m$,

$\left\|\sum_{i=n}^{m} \lambda_i e_{\upsilon_i}\right\|_1 \leq \sum_{i=n}^{m} |\lambda_i|\,\|e_{\upsilon_i}\|_1 = \sum_{i=n}^{m} \lambda_i$,

since $\|e_{\upsilon_i}\|_1 = 1$. Because $\sum_i \lambda_i = 1$, the above expression vanishes as $n, m \to \infty$, whence $\rho_n = \sum_{i=1}^{n} \lambda_i e_{\upsilon_i}$ is a Cauchy sequence in $B_1(H)$, which by completeness of the latter converges (to an element of $\mathcal{D}(H)$, as one easily verifies).

The proof of continuity is completed by noting that $K$ is continuous with respect to the trace norm, for it is isometric and hence bounded (see step 2 above). □

It is enlightening to give a rather more conceptual proof that (5.70) satisfies (5.8), which is based on a result to be used more often in the future. In what follows, for any convex set $C$, the notation $A_b(C)$ stands for the real vector space of bounded affine functions $f : C \to \mathbb{R}$, that is, bounded functions satisfying

$f(tx + (1-t)y) = tf(x) + (1-t)f(y)$, $x, y \in C$, $t \in (0,1)$. (5.78)

It is easily checked that $A_b(C)$ with the supremum-norm is a real Banach space.
Proposition 5.17. For any Hilbert space $H$ we have an isometric isomorphism

$A_b(\mathcal{D}(H)) \cong B(H)_{sa}$, (5.79)

$f \leftrightarrow a$; (5.80)

$f(\rho) = \mathrm{Tr}(\rho a)$, (5.81)

which preserves the unit (i.e., $1_{\mathcal{D}(H)} \leftrightarrow 1_H$) as well as the order (i.e., $f \geq 0$ iff $a \geq 0$).

Note that under the identification $\mathcal{D}(H) \cong S_n(B(H))$ (where in finite dimension the normal state space $S_n(B(H))$ simply coincides with the state space $S(B(H))$), where $\rho \leftrightarrow \omega$ as in (2.33), i.e., $\omega(a) = \mathrm{Tr}(\rho a)$, the above isomorphism simply reads

$A_b(S_n(B(H))) \cong B(H)_{sa}$; (5.82)

$\hat{a} \leftrightarrow a$; (5.83)

$\hat{a}(\omega) = \omega(a)$. (5.84)

Proof. It is clear that for each $a \in B(H)_{sa}$ the function $f : \rho \mapsto \mathrm{Tr}(\rho a)$ (or, equivalently, $\hat{a} : \omega \mapsto \omega(a)$) is affine as well as real-valued, and is bounded by (A.100) (supplemented, if $\dim(H) = \infty$, by Lemma B.142), noting that $\|\rho\|_1 = 1$ for $\rho \in \mathcal{D}(H)$, and in fact (B.483) yields the equality $\|f\|_\infty = \|a\|$ (or $\|\hat{a}\|_\infty = \|a\|$).

Conversely, $f \in A_b(\mathcal{D}(H))$ defines a function $Q : H \to \mathbb{R}$ by

$Q(0) = 0$; (5.85)

$Q(\psi) = \|\psi\|^2 f(e_{\psi/\|\psi\|})$ $(\psi \neq 0)$. (5.86)

This function is clearly bounded on the unit ball of $H$, as in

$|Q(\psi)| \leq \|f\|_\infty \|\psi\|^2$. (5.87)

To check that $Q$ in fact defines a quadratic form on $H$, we verify the properties (A.8) - (A.9). The first is trivial. The second follows from the easily verified identity

$t\,e_{\frac{v+w}{\|v+w\|}} + (1-t)\,e_{\frac{v-w}{\|v-w\|}} = s\,e_{\frac{v}{\|v\|}} + (1-s)\,e_{\frac{w}{\|w\|}}$, (5.88)

where $v, w \neq 0$, $v \neq \pm w$, and the coefficients $s, t$ are given by

$t = \dfrac{\|v+w\|^2}{2(\|v\|^2 + \|w\|^2)}$; (5.89)

$s = \dfrac{\|v\|^2}{\|v\|^2 + \|w\|^2}$. (5.90)

The affine property (5.78) then immediately yields (A.9). According to Proposition B.79, we obtain a unique operator $a \in B(H)_{sa}$ such that $Q(\psi) = \langle\psi, a\psi\rangle$, i.e.,

$\langle\psi, a\psi\rangle = f(e_\psi)$, $\psi \in H$, $\|\psi\| = 1$. (5.91)



Since also $\langle\psi, a\psi\rangle = \mathrm{Tr}(e_\psi a)$, we have established (5.81) for each $\rho = e_\psi$, where $\psi \in H$, $\|\psi\| = 1$. To extend this result to general density operators $\rho = \sum_i \lambda_i e_{\upsilon_i}$, we use (A.100) as well as convergence of the above sum in the trace norm $\|\cdot\|_1$, cf. the proof of Lemma 5.16; the details are analogous to the proof of Theorem B.146. □
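The identity (5.88) used in this proof is a normalized form of the parallelogram-type relation $|v+w\rangle\langle v+w| + |v-w\rangle\langle v-w| = 2|v\rangle\langle v| + 2|w\rangle\langle w|$. A quick numerical check in $\mathbb{C}^2$, with arbitrarily chosen sample vectors and hypothetical helper names, may look as follows.

```python
def norm2(v):
    return sum(abs(c) ** 2 for c in v)

def projector(v):
    """e_{v/||v||}: projection onto the line through a nonzero vector v in C^2."""
    n2 = norm2(v)
    return [[v[i] * v[j].conjugate() / n2 for j in range(2)] for i in range(2)]

def combo(c1, m1, c2, m2):
    """The combination c1 * m1 + c2 * m2 of two 2x2 matrices."""
    return [[c1 * m1[i][j] + c2 * m2[i][j] for j in range(2)] for i in range(2)]

v = [1 + 1j, 2 - 0.5j]
w = [0.25 + 0j, -1j]
plus = [a + b for a, b in zip(v, w)]
minus = [a - b for a, b in zip(v, w)]

t = norm2(plus) / (2 * (norm2(v) + norm2(w)))   # coefficient (5.89)
s = norm2(v) / (norm2(v) + norm2(w))            # coefficient (5.90)

lhs = combo(t, projector(plus), 1 - t, projector(minus))
rhs = combo(s, projector(v), 1 - s, projector(w))
max_err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Since both sides are convex combinations of pure states, applying an affine $f$ to them and multiplying by $2(\|v\|^2 + \|w\|^2)$ gives the parallelogram law for $Q$.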

Proposition 5.18. For any unit vectors $\psi, \varphi \in H$ we have

$\mathrm{Tr}(e_\psi e_\varphi) = \inf\{f(e_\psi) \mid f \in A_b(\mathcal{D}(H)),\ 0 \leq f \leq 1,\ f(e_\varphi) = 1\}$. (5.92)

The virtue of this formula is that the expression on the left-hand side, which defines the transition probabilities on $\partial_e\mathcal{D}(H) = \mathcal{P}_1(H)$, is intrinsically given by the convex structure of $\mathcal{D}(H)$. Consequently, any affine bijection of this convex set (which already preserves the boundary) must preserve these probabilities.

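Proof. By the previous proposition, eq. (5.92) is equivalent to

$\mathrm{Tr}(e_\psi e_\varphi) = \inf\{\langle\psi, a\psi\rangle \mid a \in B(H)_{sa},\ 0 \leq a \leq 1,\ \langle\varphi, a\varphi\rangle = 1\}$. (5.93)

Since $\mathrm{Tr}(e_\psi e_\varphi) = \langle\psi, e_\varphi\psi\rangle$, we are ready if we can show that the infimum is reached at $a = e_\varphi$. Therefore, we prove that for any $a$ as specified we must have $\langle\psi, a\psi\rangle \geq \mathrm{Tr}(e_\psi e_\varphi) = |\langle\varphi, \psi\rangle|^2$. To do so, we are going to find a contradiction if

$\langle\psi, a\psi\rangle < \mathrm{Tr}(e_\psi e_\varphi)$, (5.94)

for some such $a$. Indeed, $\langle\varphi, a\varphi\rangle = 1$ with $\|a\| \leq 1$ (which follows from $0 \leq a \leq 1$) and $\|\varphi\| = 1$ imply, by Cauchy-Schwarz, that $a\varphi = \varphi$. Since $a^* = a$ (by positivity of $a$), we also have $a : (\mathbb{C}\cdot\varphi)^\perp \to (\mathbb{C}\cdot\varphi)^\perp$, so we may write $a = e_\varphi + a'$, with $a'\varphi = 0$ and $a'$ mapping $(\mathbb{C}\cdot\varphi)^\perp$ to itself. Then $a \geq 0$ implies $a' \geq 0$. If (5.94) holds, then $\langle\psi, a'\psi\rangle < 0$, which contradicts positivity of $a'$ (and hence of $a$). □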
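For $H = \mathbb{C}^2$ the admissible operators in (5.93) can be listed explicitly: by the argument above they are $a = e_\varphi + c\,e_{\varphi^\perp}$ with $0 \leq c \leq 1$, where $\varphi^\perp$ is a unit vector orthogonal to $\varphi$. The following sketch (sample vectors and names are assumptions) scans this family and confirms that the minimum of $\langle\psi, a\psi\rangle$ is the transition probability, attained at $c = 0$.

```python
import math

def inner(x, y):
    """<x, y>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def normalize(v):
    n = math.sqrt(inner(v, v).real)
    return [c / n for c in v]

phi = normalize([1 + 0j, 1j])
psi = normalize([2 + 1j, 1 - 1j])
phi_perp = [-phi[1].conjugate(), phi[0].conjugate()]   # unit vector orthogonal to phi

def expectation(c):
    """<psi, a psi> for a = e_phi + c * e_{phi_perp}, with 0 <= c <= 1."""
    return abs(inner(phi, psi)) ** 2 + c * abs(inner(phi_perp, psi)) ** 2

values = [expectation(k / 10) for k in range(11)]
infimum = min(values)
transition_probability = abs(inner(phi, psi)) ** 2
orthogonality_err = abs(inner(phi, phi_perp))
```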

We now turn to the equivalence between Jordan’s Theorem and Kadison’s Theorem. 

Proposition 5.19. There is an isomorphism of groups between:

• The group of affine bijections $K : \mathcal{D}(H) \to \mathcal{D}(H)$;

• The group of Jordan automorphisms $J : B(H)_{sa} \to B(H)_{sa}$,

such that for any $a \in B(H)_{sa}$ one has

$\mathrm{Tr}(K(\rho)a) = \mathrm{Tr}(\rho J(a))$ $(\rho \in \mathcal{D}(H))$. (5.95)

This immediately follows from the following lemma (of independent interest):

Lemma 5.20. 1. There is a bijective correspondence between:

• affine bijections $K : \mathcal{D}(H) \to \mathcal{D}(H)$;

• unital positive (i.e. order-preserving) linear bijections $\alpha : B(H)_{sa} \to B(H)_{sa}$,

such that for any $a \in B(H)_{sa}$ one has (5.95).

2. A map $\alpha : B(H) \to B(H)$ is a unital positive linear bijection iff it is a Jordan automorphism.
Proof. 1. An affine bijection $K : \mathcal{D}(H) \to \mathcal{D}(H)$ induces an isomorphism

$K^* : A_b(\mathcal{D}(H)) \to A_b(\mathcal{D}(H))$; (5.96)

$f \mapsto f \circ K$, (5.97)

which is evidently unital, positive, and isometric. Consequently, by Proposition 5.17, $K^*$ corresponds to some isomorphism $\alpha : B(H)_{sa} \to B(H)_{sa}$, which necessarily shares the properties of being unital, positive, and isometric; this follows abstractly from the proposition, but may also be verified directly from (5.95). Conversely, such a map $\alpha$ yields a map $K$ directly by (5.95); to see this, we identify $\mathcal{D}(H)$ with the normal state space of $B(H)$ through $\rho \leftrightarrow \omega$, as usual, cf. (2.33), and note that $K\omega$ is the state defined by $(K\omega)(a) = \omega(\alpha(a))$, or briefly $K\omega = \omega \circ \alpha$. This is often written as $K = \alpha^*$, and for future reference we write

$\alpha^*\omega(a) = \omega(\alpha(a))$. (5.98)

2. The nontrivial direction of the proof (i.e. positive etc. $\Rightarrow$ Jordan) is based on a number of facts from operator theory:

a. Unital positive linear maps on $B(H)_{sa}$ preserve $\mathcal{P}(H)$, cf. (2.164).

b. Any two projections $e$ and $f$ are orthogonal ($ef = 0$) iff $e + f \leq 1_H$ (easy).

c. Any $a \in B(H)_{sa}$ is a norm-limit of finite sums of the kind $\sum_i \lambda_i e_i$, where $\lambda_i \in \mathbb{R}$ and the $e_i$ are mutually orthogonal projections (this follows from the spectral theorem for bounded self-adjoint operators in the form of Theorem B.104).

d. Any unital positive linear map $\alpha : B(H)_{sa} \to B(H)_{sa}$ is continuous. Since

$-\|a\|\cdot 1_H \leq a \leq \|a\|\cdot 1_H$ $(a \in B(H)_{sa})$, (5.99)

by (C.83), applying the positive map $\alpha$ and using $\alpha(1_H) = 1_H$ yields

$-\|a\|\cdot 1_H \leq \alpha(a) \leq \|a\|\cdot 1_H$.

This is possible only if $\|\alpha(a)\| \leq \|a\|$, and hence $\alpha$ is continuous with norm bounded by $\|\alpha\| \leq 1$. In fact, since $\alpha$ is unital we have $\|\alpha\| = 1$.

Therefore, any unital positive linear map $\alpha$ preserves orthogonality of projections, so if $a = \sum_i \lambda_i e_i$ (finite sum), then

$\alpha(a^2) = \alpha\left(\sum_i \lambda_i^2 e_i\right) = \sum_i \lambda_i^2 \alpha(e_i) = \sum_{i,j} \lambda_i\lambda_j \alpha(e_i)\alpha(e_j) = \alpha(a)^2$, (5.100)

since $e_i e_j = \delta_{ij} e_i$ and by the above comment also $\alpha(e_i)\alpha(e_j) = \delta_{ij}\alpha(e_i)$. By continuity of $\alpha$, this property extends to arbitrary $a \in B(H)_{sa}$. Finally, since

$a \circ b = \tfrac12\left((a+b)^2 - a^2 - b^2\right)$, (5.101)

preserving squares as in (5.100) implies preserving the Jordan product $\circ$. □
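The polarization identity (5.101) and the difference between Jordan maps and multiplicative maps can be illustrated with $2\times 2$ matrices. The transpose is used below as an assumed sample map (it is not taken from the text): it preserves the Jordan product while reversing the ordinary product.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, sign=1):
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def transpose(a):
    """Assumed sample map on M_2(C): linear, unital, reverses products."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def dist(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Two self-adjoint sample matrices that do not commute.
a = [[1.0 + 0j, 2 - 1j], [2 + 1j, -0.5 + 0j]]
b = [[0.0 + 0j, 1j], [-1j, 3.0 + 0j]]

# Identity (5.101): a o b = ((a + b)^2 - a^2 - b^2)/2.
s = add(a, b)
polarized = scale(0.5, add(add(matmul(s, s), matmul(a, a), -1), matmul(b, b), -1))
err_polarization = dist(jordan(a, b), polarized)

# The transpose preserves the Jordan product but reverses the ordinary product.
err_jordan = dist(transpose(jordan(a, b)), jordan(transpose(a), transpose(b)))
err_reversed = dist(transpose(matmul(a, b)), matmul(transpose(b), transpose(a)))
not_multiplicative = dist(transpose(matmul(a, b)), matmul(transpose(a), transpose(b)))
```

Here `not_multiplicative` is strictly positive because `a` and `b` do not commute, whereas the other three error terms vanish up to rounding.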
We now turn to the equivalence between Ludwig symmetries and Jordan ones.

Proposition 5.21. There is an isomorphism of groups between:

• The group of affine order isomorphisms $L : \mathcal{E}(H) \to \mathcal{E}(H)$;

• The group of Jordan automorphisms $J : B(H)_{sa} \to B(H)_{sa}$.

Proof. Since $L$ is an order isomorphism, it satisfies $L(0) = 0$ (as well as $L(1_H) = 1_H$), since $0$ is the bottom element of $\mathcal{E}(H)$ as a poset (and $1_H$ is its top element). As in the proof of Lemma 5.11, one shows that this property plus convexity implies $L(ta) = tL(a)$ and $L(a + b) = L(a) + L(b)$ whenever defined. Defining $J$ by

$J(0) = 0$; (5.102)

$J(a) = s\cdot L(a/s)$ $(a > 0,\ s \geq \|a\|)$; (5.103)

$J(a) = -J(-a)$ $(a < 0)$, (5.104)

where $a > 0$ means $a \geq 0$ and $a \neq 0$, and $a < 0$ means $-a \geq 0$ and $a \neq 0$, once again the reasoning near the end of the proof of Lemma 5.11 shows that $J$ is linear; it is a unital order-preserving bijection by construction. Hence $J$ is a Jordan automorphism by Lemma 5.20.2. Of course, instead of (5.104) one could equivalently have defined $J$ on general $a \in B(H)_{sa}$ by $J(a) = J(a_+) - J(a_-)$, using the (by now hopefully familiar) decomposition $a = a_+ - a_-$ with $a_\pm \geq 0$ and $a_+a_- = 0$.

Conversely, once again using Lemma 5.20.2, a Jordan automorphism (5.11) preserves order as well as the unit, so that the inequality $0 \leq a \leq 1_H$ characterizing $a \in \mathcal{E}(H)$ is preserved, i.e., $0 \leq J(a) \leq 1_H$. Thus $J$ preserves $\mathcal{E}(H)$, where it preserves order. Convexity is obvious, since $L = J|_{\mathcal{E}(H)}$ comes from a linear map. □

The equivalence between Jordan’s Theorem and von Neumann’s Theorem (provided $\dim(H) \geq 3$) hinges on the following corollary of Gleason’s Theorem (cf. §D.1).

Corollary 5.22. Let $\dim(H) > 2$. Then an isomorphism $N$ of $\mathcal{P}(H)$ as an orthocomplemented lattice has a unique extension to a linear map $\alpha : B(H)_{sa} \to B(H)_{sa}$, which is (automatically) invertible, unital, and positive.

Proof. According to Lemma D.2, $N$ preserves all suprema in $\mathcal{P}(H)$. Since we have $\sum_i e_i = \bigvee_i e_i$ for any family $(e_i)$ of mutually orthogonal projections and since $N$ by definition preserves the orthocomplementation $e^\perp = 1 - e$ and hence preserves orthogonality of projections, we may compute

$N\left(\sum_i e_i\right) = N\left(\bigvee_i e_i\right) = \bigvee_i N(e_i) = \sum_i N(e_i)$. (5.105)

Consequently, for any normal state $\omega$ on $B(H)$, the map $e \mapsto \omega \circ N(e)$ is a probability measure on $\mathcal{P}(H)$, which by Gleason’s Theorem has a unique linear extension to $B(H)$ and hence a fortiori to $B(H)_{sa}$. We use this in order to define $\alpha$, as follows.

First, let $a \in B(H)_{sa}$ and suppose $a = \sum_j \lambda_j f_j$ for some finite family $(f_j)$ of projections (not necessarily orthogonal), and some $\lambda_j \in \mathbb{R}$. Then $\sum_j \lambda_j N(f_j)$ is independent of the particular decomposition of $a$ that has been chosen, so we may put

$\alpha(a) = \sum_j \lambda_j N(f_j)$. (5.106)

To see this, put $a = \sum_j \lambda'_j f'_j$, hence $\alpha'(a) = \sum_j \lambda'_j N(f'_j)$, and suppose $\alpha'(a) \neq \alpha(a)$. By (B.477) there exists a normal state $\omega$ such that $\omega(\alpha'(a)) \neq \omega(\alpha(a))$; indeed, each element of $B_1(H)$ is a linear combination of at most four density operators, so that each normal linear functional on $B(H)$ is a linear combination of at most four normal states. But since $\omega \circ N$ is linear, this implies $\omega \circ N(a) \neq \omega \circ N(a)$, which is a contradiction. Hence $\alpha'(a) = \alpha(a)$ and accordingly, (5.106) is well defined. Because it is independent of the decomposition of $a$ into projections, $\alpha$ is linear: if $a = \sum_j \lambda_j f_j$ and $a' = \sum_j \lambda'_j f'_j$, then $a + a' = \sum_j \lambda_j f_j + \sum_j \lambda'_j f'_j$, so that

$N(a + a') = N\left(\sum_j \lambda_j f_j + \sum_j \lambda'_j f'_j\right) = \sum_j \lambda_j N(f_j) + \sum_j \lambda'_j N(f'_j) = N(a) + N(a')$.

Similarly, for any $t \in \mathbb{R}$ we have

$N(ta) = N\left(\sum_j t\lambda_j f_j\right) = \sum_j t\lambda_j N(f_j) = t\sum_j \lambda_j N(f_j) = tN(a)$.

We may now extend $\alpha$ to all of $B(H)_{sa}$ by continuity. Indeed, according to the spectral theorem in the form (B.326), the set of all operators of the form $a = \sum_j \lambda_j f_j$ with all $f_j$ mutually orthogonal (so that $a$ is given by its spectral resolution) is norm-dense in $B(H)_{sa}$. Applying (5.106), and noting that $\|a\| = \sup_j |\lambda_j|$, we may estimate

$\|\alpha(a)\| = \left\|\sum_j \lambda_j N(f_j)\right\| \leq \sup_j\{|\lambda_j|\}\left\|\sum_j N(f_j)\right\| \leq \|a\|$,

since the $N(f_j)$ are mutually orthogonal and hence sum to some projection, which has norm 1 (unless $a = 0$). For general $a \in B(H)_{sa}$, we may therefore define $N$ by $N(a) = \lim_n N(a_n)$, where each $a_n$ is of the above (spectral) form and $\|a_n - a\| \to 0$.

To prove that $\alpha$ is positive, we show that $\alpha(a) \geq 0$ whenever $a \geq 0$. As in the preceding step, initially suppose that $a = \sum_j \lambda_j f_j$ has a finite spectral resolution. Then $a \geq 0$ iff $\lambda_j \geq 0$ for each $j$, and hence $\alpha(a) \geq 0$ by (5.106), since by orthogonality of the $N(f_j)$ this equation states the spectral resolution of $\alpha(a)$. Now if $a_n \geq 0$ and $a_n \to a$ (in norm), then $\langle\psi, a_n\psi\rangle \to \langle\psi, a\psi\rangle$, which must remain positive, so that $a \geq 0$. Hence positivity of $\alpha$ on all of $B(H)_{sa}$ follows by continuity.

Finally, $\alpha$ inherits invertibility from $N$, and it is unital by (5.105), taking $e_i = |\upsilon_i\rangle\langle\upsilon_i|$ for some basis $(\upsilon_i)$ of $H$ (or using the fact that it preserves $\top = 1_H$). □

Subsequently, we use Lemma 5.20 to further extend $\alpha$ by complex linearity to a Jordan isomorphism of $B(H)$; see Definition 5.1.

Finally, the equivalence between weak Jordan symmetries and Bohr symmetries follows from Hamhalter’s Theorem 9.4, whereas Theorem 9.7 strengthens this to an equivalence between Jordan symmetries and Bohr symmetries. The proof of these theorems does not seem to simplify in the special case at hand, i.e. $A = B(H)$.
5.4 Proof of Jordan’s Theorem

In view of the equivalence between the six parts of Theorem 5.4, we only need to prove one of them. In the literature, one only finds proofs of Jordan’s Theorem and of Wigner’s Theorem, and we present each of these (surprisingly but instructively, these proofs look completely different). We start with Jordan’s Theorem:

Theorem 5.23. Any Jordan automorphism $J_{\mathbb{C}}$ of $B(H)$ is given by either

$J_{\mathbb{C}}(a) = \alpha_u(a) = uau^*$, (5.107)

where $u$ is unitary (and is determined by $J_{\mathbb{C}}$ up to a phase), or by

$J_{\mathbb{C}}(a) = \alpha'_u(a) = ua^*u^*$, (5.108)

where $u$ is anti-unitary (and is determined by $J_{\mathbb{C}}$ up to a phase, too).

The difficult part of the proof is Theorem C.175, which implies:

Proposition 5.24. A Jordan automorphism $\alpha$ of $B(H)$ is either an automorphism or an anti-automorphism.

Recall that an automorphism of $B(H)$ is a linear bijection $\alpha : B(H) \to B(H)$ that satisfies $\alpha(a^*) = \alpha(a)^*$ and $\alpha(ab) = \alpha(a)\alpha(b)$; an anti-automorphism, on the other hand, satisfies the first property whilst the latter is replaced by $\alpha(ab) = \alpha(b)\alpha(a)$. Clearly, both automorphisms and anti-automorphisms are Jordan automorphisms. Granting this result, we may deal with the two cases separately.

Proposition 5.25. Any automorphism $\alpha : B(H) \to B(H)$ takes the form $\alpha = \alpha_u$, see (5.107), where $u : H \to H$ is unitary, uniquely determined by $\alpha$ up to a phase.

The proof uses the following lemmas. The first follows from Theorem C.62.4.

Lemma 5.26. If $\alpha : B(H) \to B(H)$ is an automorphism and $a \in B(H)$, then

$\|\alpha(a)\| = \|a\|$. (5.109)

Lemma 5.27. If $\alpha : B(H) \to B(H)$ is an automorphism and $e \in B(H)$ is a one-dimensional projection, then so is $\alpha(e)$.

Proof. It should be obvious that automorphisms $\alpha$ preserve projections $e$ (whose defining properties are $e^2 = e^* = e$). Furthermore, $\alpha$ preserves order, i.e., if $a \geq 0$ (in that, as always, $\langle\psi, a\psi\rangle \geq 0$ for each $\psi \in H$, or, equivalently, $a = b^*b$), then $\alpha(a) \geq 0$ (this is clear from the second way of expressing positivity). Consequently, if $a \leq b$ (in that $b - a \geq 0$), then $\alpha(a) \leq \alpha(b)$. We notice that if we define $e \leq f$ iff $eH \subseteq fH$, then $e \leq f$ iff $e \leq f$ as self-adjoint operators (in that $\langle\psi, e\psi\rangle \leq \langle\psi, f\psi\rangle$ for each $\psi \in H$); see Proposition C.170. With respect to the ordering $\leq$ the one-dimensional projections $e$ are atomic, in the sense that $0 \leq e$ (but $e \neq 0$) and if $0 \leq f \leq e$, then either $f = 0$ or $f = e$. Now automorphisms of $B(H)$ restrict to isomorphisms of the projection lattice $\mathcal{P}(H)$, which preserve atoms (as these are intrinsically defined by the partial order). □
We are now ready for the (constructive!) proof of Proposition 5.25.

Proof. For some fixed unit vector $\chi \in H$, take the corresponding one-dimensional projection $e_\chi$ and define a new unit vector $\varphi$ (up to a phase) by

$e_\varphi = \alpha^{-1}(e_\chi)$. (5.110)

Now any $\psi \in H$ may be written as $\psi = a\varphi$, for some $a \in B(H)$. Attempt to define an operator $u$ by $u\psi = \alpha(a)\chi$, i.e.,

$ua\varphi = \alpha(a)\chi$. (5.111)

This looks dangerously ill-defined, since many different operators $a$ may give rise to the same $\psi$. Fortunately, we may compute

$\|a\varphi\|_H = \|ae_\varphi\varphi\|_H = \|ae_\varphi\|_{B(H)} = \|\alpha(ae_\varphi)\|_{B(H)}$

$= \|\alpha(a)\alpha(e_\varphi)\|_{B(H)} = \|\alpha(a)e_\chi\|_{B(H)} = \|\alpha(a)\chi\|_H$

$= \|ua\varphi\|_H$,

so that if $a\varphi = b\varphi$, then $\alpha(a)\chi = \alpha(b)\chi$ and hence $u$ is well defined. By this computation $u$ is also isometric and since it is clearly surjective, it is unitary. The property $\alpha(a) = uau^*$ is equivalent to $ua = \alpha(a)u$, which in turn is equivalent to $uab\varphi = \alpha(a)ub\varphi$ for any $b \in B(H)$, which by definition of $u$ is the same as

$\alpha(ab)\chi = \alpha(a)\alpha(b)\chi$. (5.112)

But this holds by virtue of $\alpha$ being an automorphism. Finally, all arbitrariness in $u$ lies in the lack of uniqueness of $\varphi$ given its definition (5.110). □
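Because this proof is constructive, it can be traced step by step in $\mathbb{C}^2$. In the sketch below the automorphism is $\alpha(a) = vav^*$ for an assumed sample unitary $v$, and the operator `u` is rebuilt from $\alpha$ alone via (5.110) and (5.111), using the hypothetical choice $a = |\psi\rangle\langle\varphi|$ (so that $a\varphi = \psi$).

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def apply(a, x):
    return [sum(a[i][k] * x[k] for k in range(2)) for i in range(2)]

def ketbra(x, y):
    """The rank-one operator |x><y|."""
    return [[x[i] * y[j].conjugate() for j in range(2)] for i in range(2)]

def norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

# Assumed sample unitary v; u below only uses the automorphism alpha(a) = v a v*.
theta = 0.6
v = [[math.cos(theta) + 0j, -math.sin(theta) * cmath.exp(0.4j)],
     [math.sin(theta) * cmath.exp(-0.4j), math.cos(theta) + 0j]]

def alpha(a):
    return matmul(matmul(v, a), dagger(v))

def alpha_inverse(a):
    return matmul(matmul(dagger(v), a), v)

chi = [1 + 0j, 0j]
e_phi = alpha_inverse(ketbra(chi, chi))          # (5.110)
column = [e_phi[0][0], e_phi[1][0]]              # spans the range of e_phi (nonzero here)
phi = [c / norm(column) for c in column]

def u(psi):
    """(5.111) with a = |psi><phi|, so that a phi = psi and u psi = alpha(a) chi."""
    return apply(alpha(ketbra(psi, phi)), chi)

psi = [0.6 + 0j, 0.8j]
a = [[1 + 0j, 2j], [0.5 + 0j, -1 + 1j]]

isometry_err = abs(norm(u(psi)) - norm(psi))
lhs, rhs = u(apply(a, psi)), apply(alpha(a), u(psi))     # checks u a = alpha(a) u
intertwine_err = max(abs(p - q) for p, q in zip(lhs, rhs))
z = sum(p.conjugate() * q for p, q in zip(apply(v, psi), u(psi)))  # phase relating u to v
phase_err = abs(abs(z) - 1)
```

The reconstructed `u` agrees with `v` up to the phase `z`, in line with the uniqueness statement.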

Proposition 5.28. Any anti-automorphism $\alpha : B(H) \to B(H)$ takes the form $\alpha = \alpha'_u$, cf. (5.108), where $u : H \to H$ is anti-unitary, uniquely determined by $\alpha$ up to a phase.

Proof. Pick an arbitrary anti-unitary operator $J : H \to H$ and define

$\beta : B(H) \to B(H)$;

$\beta(a) = Ja^*J^*$. (5.113)

Then $\alpha \circ \beta$ is an automorphism, to which Proposition 5.25 applies, so that

$\alpha \circ \beta(a) = \tilde{u}a\tilde{u}^*$, (5.114)

for some unitary $\tilde{u}$. Hence

$\alpha(a) = \alpha(\beta \circ \beta^{-1}(a)) = \alpha \circ \beta(J^*a^*J) = \tilde{u}J^*a^*J\tilde{u}^*$,

so that $\alpha(a) = ua^*u^*$ with $u = \tilde{u}J^*$.

The precise lack of uniqueness of $u$ is inherited from the unitary case. □



5.5 Proof of Wigner’s Theorem 

We recall Wigner’s Theorem, i.e. Theorem 5.4.1:

Theorem 5.29. Each bijection $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$ that satisfies

$\mathrm{Tr}(W(e)W(f)) = \mathrm{Tr}(ef)$, (5.115)

is given by $W(e) = ueu^* = \alpha_u(e)$, where the operator $u$ is either unitary or anti-unitary, and is uniquely determined by $W$ up to a phase.

The problem is to lift a given map W : (H) —>■ !^\ (H) that satisfies (5.115) to 

either a unitary or an anti-unitary map u:H ^ H such that 

W(ey,) = (5.116) 

Suppose W(e^) = e^/. Since = Sy, for any z G T, and likewise for this means 
that u\j/ = zy/' for some z G T; the problem is to choose the z’s coherently all over 
the unit sphere of H. There are many proofs in the literature, of which the following 
one—partly based on an earlier proof by Bargmann (1964)—has the advantage of 
making at least the construction of u explicit (at the cost of opaque proofs of some 
crucial lemma’s). We assume dim(i/) > 2, since H = C^ has already been covered. 

Fix unit vectors xj/ G H and \j/' G W(ey/)//; clearly, \j/' is unique up to multiplica¬ 
tion by z G T, whose choice turns out to completely determine u (i.e., the ambiguity 
in \j/' is the only one in the entire construction). For a modest start, we put 

u\j/=\j/'. (5.117) 

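Lemma 5.30. If $V \subset H$ is a $k$-dimensional subspace (where $k < \infty$), then there is a
unique $k$-dimensional linear subspace $V' \subset H$ with the following property:

For all unit vectors $\psi \in H$, we have $\psi \in V$ iff $W(e_\psi)H \subset V'$.

Proof. Pick an orthonormal basis $(v_1, \ldots, v_k)$ of $V$ and find unit vectors $v_i' \in H$ such that
$v_i' \in W(e_{v_i})H$, $i = 1, \ldots, k$. Then, using (5.115) we compute

$$|\langle v_i', v_j'\rangle|^2 = \mathrm{Tr}(e_{v_i'} e_{v_j'}) = \mathrm{Tr}(W(e_{v_i})W(e_{v_j})) = \mathrm{Tr}(e_{v_i} e_{v_j}) = |\langle v_i, v_j\rangle|^2 = \delta_{ij},$$

so that the vectors $(v_1', \ldots, v_k')$ form an orthonormal set and hence form a basis
of their linear span $V'$. Now, as mentioned below (B.214), we have $\psi \in V$ iff
$\sum_{i=1}^k |\langle v_i, \psi\rangle|^2 = 1$; similarly, for a unit vector $\psi' \in W(e_\psi)H$ we have $\psi' \in V'$ iff
$\sum_{i=1}^k |\langle v_i', \psi'\rangle|^2 = 1$. Since $W$ preserves
transition probabilities, a computation similar to the one just given yields

$$\sum_{i=1}^k |\langle v_i, \psi\rangle|^2 = \sum_{i=1}^k |\langle v_i', \psi'\rangle|^2, \tag{5.118}$$

so that both sides do or do not equal unity, and hence $\psi \in V$ iff $\psi' \in V'$. □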




Wigner's Theorem for $H = \mathbb{C}^2$ (i.e. Theorem 5.7) implies:

Lemma 5.31. If $V$ and $V'$ are related as in Lemma 5.30, and

$$\dim(V) = \dim(V') = 2, \tag{5.119}$$

then there is a unitary or anti-unitary operator $u_V : V \to V'$ such that

$$W(e) = u_V e u_V^*, \tag{5.120}$$

for any one-dimensional projection $e \in \mathcal{P}_1(V)$, where $\mathcal{P}_1(V) \subset \mathcal{P}_1(H)$ consists of
all $e \in \mathcal{P}_1(H)$ with $eH \subset V$. Moreover, $u_V$ is unique up to a phase.

Proof. A choice of basis for both $V$ and $V'$ gives unitary isomorphisms $u : V \to \mathbb{C}^2$
and $u' : V' \to \mathbb{C}^2$, which jointly induce a map

$$W' = u' W u^{-1} : \mathcal{P}_1(\mathbb{C}^2) \to \mathcal{P}_1(\mathbb{C}^2). \tag{5.121}$$

This map satisfies the hypotheses of Wigner's Theorem in $d = 2$, and so it is (anti-)
unitarily induced as $W' = \alpha_v$, where $v : \mathbb{C}^2 \to \mathbb{C}^2$ is (anti-)unitary. Then the operator
$u_V = (u')^{-1} v u$ does the job; its lack of uniqueness stems entirely from $v$. □

Lemma 5.32. Given a Wigner symmetry $W$, the ensuing operator $u_V$ is either unitary
or anti-unitary for all two-dimensional subspaces $V \subset H$ (simultaneously).

Proof. We first design a “unitarity test” for $W$. Define a function

$$T : \mathcal{P}_1(H) \times \mathcal{P}_1(H) \times \mathcal{P}_1(H) \to \mathbb{C}; \tag{5.122}$$

$$T(e, f, g) = \mathrm{Tr}(efg), \tag{5.123}$$

$$T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3}) = \langle\psi_1, \psi_2\rangle \langle\psi_2, \psi_3\rangle \langle\psi_3, \psi_1\rangle. \tag{5.124}$$

Let $V \subset H$ be two-dimensional and pick an orthonormal basis $(v_1, v_2)$. Define

$$\chi_1 = v_1, \quad \chi_2 = (v_1 - v_2)/\sqrt{2}, \quad \chi_3 = (v_1 - i v_2)/\sqrt{2}. \tag{5.125}$$

A simple computation then shows that

$$T(e_{\chi_1}, e_{\chi_2}, e_{\chi_3}) = \tfrac{1}{4}(1 + i). \tag{5.126}$$

It follows from (5.124) that for $u$ unitary and $v$ anti-unitary, we have

$$T(e_{u\psi_1}, e_{u\psi_2}, e_{u\psi_3}) = T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3}); \tag{5.127}$$

$$T(e_{v\psi_1}, e_{v\psi_2}, e_{v\psi_3}) = \overline{T(e_{\psi_1}, e_{\psi_2}, e_{\psi_3})}. \tag{5.128}$$

Eq. (5.126) implies that if $W : \mathcal{P}_1(V) \to \mathcal{P}_1(V')$ is (anti-)unitarily implemented, we have

$$T(W(e_{\chi_1}), W(e_{\chi_2}), W(e_{\chi_3})) = T(e_{u\chi_1}, e_{u\chi_2}, e_{u\chi_3}) = \tfrac{1}{4}(1 \pm i), \tag{5.129}$$

with a plus sign if $u$ is unitary and a minus sign if $u$ is anti-unitary. Now take a
second pair $(\tilde{V}, \tilde{V}')$ as above, and pick a basis $(\tilde{v}_1, \tilde{v}_2)$ of $\tilde{V}$, with associated vectors
$(\tilde{\chi}_1, \tilde{\chi}_2, \tilde{\chi}_3)$, as in (5.125). Suppose $u : V \to V'$ implementing $W$ is unitary, whereas
$\tilde{u} : \tilde{V} \to \tilde{V}'$ implementing $W$ is anti-unitary. It then follows from (5.129) that

$$T(W(e_{\chi_1}), W(e_{\chi_2}), W(e_{\chi_3})) = T(e_{u\chi_1}, e_{u\chi_2}, e_{u\chi_3}) = \tfrac{1}{4}(1 + i); \tag{5.130}$$

$$T(W(e_{\tilde{\chi}_1}), W(e_{\tilde{\chi}_2}), W(e_{\tilde{\chi}_3})) = T(e_{\tilde{u}\tilde{\chi}_1}, e_{\tilde{u}\tilde{\chi}_2}, e_{\tilde{u}\tilde{\chi}_3}) = \tfrac{1}{4}(1 - i). \tag{5.131}$$

In view of (C.637), the following expression defines a metric $d$ on $\mathcal{P}_1(H)$:

$$d(e_\psi, e_\varphi) = \|\omega_\psi - \omega_\varphi\| = \|e_\psi - e_\varphi\|_1 = 2\sqrt{1 - |\langle\varphi, \psi\rangle|^2}, \tag{5.132}$$

with respect to which both $W$ and $T$ are continuous (the latter with respect to the
product metric on $\mathcal{P}_1(H)^3$, of course). Let $t \mapsto (v_1(t), v_2(t))$ be a continuous path of
orthonormal vectors (i.e., in $H \times H$), with associated vectors $(\chi_1(t), \chi_2(t), \chi_3(t))$, as
in (5.125). Then the function $f(t) = T(W(e_{\chi_1(t)}), W(e_{\chi_2(t)}), W(e_{\chi_3(t)}))$ is continuous,
and by (5.129) it can only take the values $\tfrac{1}{4}(1 \pm i)$. Hence $f(t)$ must be constant.
However, taking a path such that $(v_1(0), v_2(0)) = (v_1, v_2)$ and $(v_1(1), v_2(1)) =
(\tilde{v}_1, \tilde{v}_2)$ gives $f(0) = \tfrac{1}{4}(1 + i)$ and $f(1) = \tfrac{1}{4}(1 - i)$, which is a contradiction. □
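
The unitarity test lends itself to a quick numerical check. The following is a minimal Python sketch, assuming NumPy and taking $H = \mathbb{C}^2$ with its standard basis; the helper names `proj` and `T` and the randomly generated unitary are illustrative choices, not part of the text. It uses an inner product that is linear in its second argument, which is the convention under which (5.126) evaluates to $\tfrac{1}{4}(1+i)$.

```python
import numpy as np

def proj(psi):
    """Rank-one projection e_psi = |psi><psi| onto the span of psi."""
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def T(e, f, g):
    """Triple invariant T(e, f, g) = Tr(e f g) of (5.123)."""
    return np.trace(e @ f @ g)

# Orthonormal basis (v1, v2) of C^2 and the test vectors of (5.125).
v1 = np.array([1.0, 0.0], dtype=complex)
v2 = np.array([0.0, 1.0], dtype=complex)
chi = [v1, (v1 - v2) / np.sqrt(2), (v1 - 1j * v2) / np.sqrt(2)]

# A random unitary obtained from a QR decomposition (illustrative choice).
rng = np.random.default_rng(0)
m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
u, _ = np.linalg.qr(m)

t_plain = T(*[proj(c) for c in chi])
# Unitary symmetry: psi -> u psi leaves T unchanged, cf. (5.127).
t_unitary = T(*[proj(u @ c) for c in chi])
# Anti-unitary symmetry: psi -> u conj(psi) conjugates T, cf. (5.128).
t_anti = T(*[proj(u @ c.conj()) for c in chi])

print(t_plain, t_unitary, t_anti)
```

The sign of the imaginary part of `T` is what distinguishes the two cases, which is why the three vectors in (5.125) must include one with a complex coefficient.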

Lemma 5.33. Wigner's Theorem holds for three-dimensional Hilbert spaces.

Proof. Let $(v_1, v_2, v_3)$ be some orthonormal basis of $H$ (like the usual basis of $H = \mathbb{C}^3$). We
first show that if $W$ is the identity when restricted to both $\mathrm{span}(v_1, v_2)$ and $\mathrm{span}(v_1, v_3)$,
then $W$ is the identity on $H$ altogether. To this end, take $\psi = \sum_i c_i v_i$, initially with
$c_1 \in \mathbb{R}\setminus\{0\}$. Take a unit vector $\psi' \in W(e_\psi)H$, with $\psi' = \sum_i c_i' v_i$. By the first
assumption on $W$ we have $|\langle v, \psi'\rangle| = |\langle v, \psi\rangle|$ for any unit vector $v \in \mathrm{span}(v_1, v_2)$. Taking

$$v = v_1, \quad v = v_2, \quad v = (v_1 + v_2)/\sqrt{2}, \quad v = (v_1 + i v_2)/\sqrt{2}, \tag{5.133}$$

gives the equations

$$|c_1'| = |c_1|, \quad |c_2'| = |c_2|, \quad |c_1' + c_2'| = |c_1 + c_2|, \quad |c_1' - i c_2'| = |c_1 - i c_2|, \tag{5.134}$$

respectively. By a choice of phase we may and will assume $c_1' = c_1$, in which case
the only solution is $c_2' = c_2$ (geometrically, the solution $c_2'$ lies in the intersection
of three different circles in the complex plane, which is either empty or consists
of a single point). Similarly, the second assumption on $W$ gives $c_3' = c_3$, whence
$\psi' = \psi$. The case $c_1 = 0$ may be settled by a straightforward limit argument, since
inner products (and hence their absolute values) are continuous on $H \times H$.

Given a Wigner symmetry $W : \mathcal{P}_1(H) \to \mathcal{P}_1(H)$, we now construct $u$ as follows.

1. Fix a basis $(v_1, v_2, v_3)$ with “image” $(v_1', v_2', v_3')$ under $W$, i.e., $W(e_{v_i}) = e_{v_i'}$.

2. The unitarity test in the proof of Lemma 5.32 settles whether the operators should be
chosen to be unitary or anti-unitary; for simplicity we assume the unitary case.

3. Define a unitary $u_1 : H \to H$ by $u_1 v_i' = v_i$ for $i = 1, 2, 3$, and subsequently define
$W_1 = \alpha_{u_1} \circ W$, which (being the composition of two Wigner symmetries)
is a Wigner symmetry. Clearly, $W_1(e_{v_i}) = e_{v_i}$ ($i = 1, 2, 3$), so that $W_1$ maps
$\mathcal{P}_1(H_{(12)})$ to itself, where $H_{(12)} = \mathrm{span}(v_1, v_2)$. Hence Lemma 5.31 gives a unitary
map $\hat{u}_1 : H_{(12)} \to H_{(12)}$ such that the restriction of $W_1$ to $H_{(12)}$ is $\alpha_{\hat{u}_1}$.

4. Define a unitary $u_2 : H \to H$ by $u_2 = \hat{u}_1^{-1}$ on $H_{(12)}$ and $u_2 v_3 = v_3$, followed by the
Wigner symmetry $W_2 = \alpha_{u_2} \circ W_1$. By construction, $W_2(e_{v_i}) = e_{v_i}$ for $i = 1, 2, 3$
($W_2$ is even the identity on $\mathcal{P}_1(H_{(12)})$), so that $W_2$ maps $\mathcal{P}_1(H_{(13)})$ to itself,
where $H_{(13)} = \mathrm{span}(v_1, v_3)$. Hence the restriction of $W_2$ to $H_{(13)}$ is implemented
by a unitary $\hat{u}_2 : H_{(13)} \to H_{(13)}$, whose phase may be fixed by requiring $\hat{u}_2 v_1 = v_1$.

5. Similarly to $u_2$, we define $u_3 : H \to H$ by $u_3 = \hat{u}_2^{-1}$ on $H_{(13)}$ and $u_3 v_2 = v_2$, so
that $u_3$ is the identity on $H_{(12)}$. Of course, we now define a Wigner symmetry

$$W_3 = \alpha_{u_3} \circ W_2 = \alpha_{u_3} \circ \alpha_{u_2} \circ \alpha_{u_1} \circ W, \tag{5.135}$$

which by construction is the identity on both $\mathcal{P}_1(H_{(12)})$ and $\mathcal{P}_1(H_{(13)})$, and so
by the first part of the proof it must be the identity on all of $\mathcal{P}_1(H)$. Hence

$$W = \alpha_{u_1}^{-1} \circ \alpha_{u_2}^{-1} \circ \alpha_{u_3}^{-1} = \alpha_u \quad (u = u_1^* u_2^* u_3^*). \qquad \Box$$

Lemma 5.34. As in Lemma 5.30, if $\dim(V) = \dim(V') = 3$, then there is a unitary
or anti-unitary operator $u_V : V \to V'$ such that $W(e) = u_V e u_V^*$ for any $e \in \mathcal{P}_1(V)$.

Proof. Given Lemma 5.33, the proof is practically the same as for Lemma 5.31. □

We now finish the proof of Wigner's Theorem. We assume that the outcome
of Lemma 5.32 is that each $u_V$ is unitary; the anti-unitary case requires obvious
modifications of the argument below. The first step is, of course, to define $u(\lambda\psi) =
\lambda u\psi$, $\lambda \in \mathbb{C}$ (so this would have been $\overline{\lambda} u\psi$ in the anti-unitary case). Let $\varphi \in H$ be
linearly independent of $\psi$ and consider the two-dimensional space $V$ spanned by $\psi$
and $\varphi$. Define $u(\varphi) = u_V \varphi$. With (5.117), this defines $u$ on all of $H$. To prove that
$u$ is linear, take $\varphi_1$ and $\varphi_2$ linearly independent of each other and of $\psi$, so that the
linear span $V_3$ of $\psi$, $\varphi_1$, and $\varphi_2$ is three-dimensional. Let $V_i$ be the two-dimensional
linear span of $\psi$ and $\varphi_i$, $i = 1, 2$. Then $u\varphi_i = u_{V_i}\varphi_i$, where the phase of $u_{V_i}$ is fixed
by (5.117). Let $w : V_3 \to V_3'$ be the unitary that implements $W$ according to Lemma
5.33.2, with phase determined by (5.117). Since $u_{V_1}$, $u_{V_2}$, and $w$ are unique up
to a phase and this phase has been fixed for each in the same way, we must have
$u_{V_1} = w|_{V_1}$ and $u_{V_2} = w|_{V_2}$. Finally, we have $V_{12}$ spanned by $\psi$ and $\varphi_1 + \varphi_2$, and by
the same token, $u_{V_{12}} = w|_{V_{12}}$. Now $w$ is unitary and hence linear, so

$$u(\varphi_1 + \varphi_2) = u_{V_{12}}(\varphi_1 + \varphi_2) = w(\varphi_1 + \varphi_2) = w(\varphi_1) + w(\varphi_2)
= u_{V_1}(\varphi_1) + u_{V_2}(\varphi_2) = u(\varphi_1) + u(\varphi_2),$$

since this is how $u$ was defined. Since each $u_V$ is unitary, so is $u$, and similarly it is
easy to verify that $u$ implements $W$, because each $u_V$ does so. □

5.6 Some abstract representation theory

Since all symmetries we have considered (named after Wigner, Kadison, Jordan,
Ludwig, von Neumann, and Bohr) are implemented by either unitary or anti-unitary
operators, which are determined (by the given symmetry) only up to a phase $z \in \mathbb{T}$,
the quantum-mechanical symmetry group of a Hilbert space $H$ is given by the quotient

$$(U(H) \cup U_a(H))/\mathbb{T}, \tag{5.136}$$

where $U(H)$ is the group of unitary operators on $H$, and $U_a(H)$ is the set of
anti-unitary operators on $H$; the latter is not a group (since the product of two
anti-unitaries is unitary) but their union is. Furthermore, $\mathbb{T}$ is identified with the normal
subgroup $\mathbb{T} \equiv \mathbb{T} \cdot 1_H = \{z \cdot 1_H \mid z \in \mathbb{T}\}$ of $U(H) \cup U_a(H)$ (and also of $U(H)$)
consisting of multiples of the unit operator by a phase; thus the quotient is a group.

The fact that this quotient, rather than $U(H)$, is the symmetry group of quantum mechanics
has profound consequences (one of which is our very existence), which we will
study from §5.10 onwards. However, this material relies on the theory of “ordinary”
(i.e., non-projective) unitary representations, which we therefore review first.

Namely, let $G$ be a group. In mathematics, the natural kind of action of $G$ on a
Hilbert space $H$ is a unitary representation, i.e., a homomorphism

$$u : G \to U(H), \tag{5.137}$$

so that $u(x)^{-1} = u(x^{-1}) = u(x)^*$ and $u(x)u(y) = u(xy)$, which imply $u(e) = 1_H$.

As to the possible continuity properties of unitary representations in case that 
G is a topological group (i.e., a group G that is also a topological space, such that 
group multiplication G x G ^ G and inverse G ^ G are continuous), one should 
equip U(H) with the strong operator topology (as opposed to the norm topology). 

Proposition 5.35. If $u : x \mapsto u(x)$ is a unitary representation of some locally compact
group $G$ on a Hilbert space $H$, then the following conditions are equivalent:

1. The map $G \times H \to H$, $(x, \psi) \mapsto u(x)\psi$, is continuous;

2. The map $G \to U(H)$, $x \mapsto u(x)$, is continuous in the strong topology on $U(H)$.

Proof. Strong continuity means that if $x_\lambda \to x$ in $G$, then for each $\psi \in H$ we have
$\|(u(x_\lambda) - u(x))\psi\| \to 0$. This is clearly implied by the first kind of continuity, giving
$1 \Rightarrow 2$, so let us prove the nontrivial converse. Suppose $x_\lambda \to x$ and $\psi_\mu \to \psi$; since
$G$ is locally compact, $x$ has a compact neighborhood $K$ and we may assume that
each $x_\lambda \in K$. If $u$ is strongly continuous, then for any $\varphi \in H$ the set $\{u(y)\varphi, y \in K\}$
is compact in $H$ and hence bounded. The Banach-Steinhaus Theorem B.78 gives
boundedness of the corresponding operator norms, that is, $\{\|u(y)\|, y \in K\} \leq C_K$ for
some $C_K > 0$. We now estimate

$$\|u(x_\lambda)\psi_\mu - u(x)\psi\| \leq \|u(x_\lambda)\psi_\mu - u(x_\lambda)\psi\| + \|(u(x_\lambda) - u(x))\psi\|.$$

The first term vanishes as $\psi_\mu \to \psi$ since it is bounded by $C_K \|\psi_\mu - \psi\|$, whereas the
second vanishes as $x_\lambda \to x$ by the (assumed) strong continuity of $u$. □

Since the first kind of continuity is the usual one for group actions, this justifies the
choice of strong continuity as the natural one for unitary representations (to which
a pragmatic point may be added: norm continuity is quite rare for unitary
representations on infinite-dimensional Hilbert spaces). Things further simplify under mild
restrictions on $G$ and $H$, which are satisfied in all examples of physical interest.

Proposition 5.36. If $H$ is separable and $G$ is second countable locally compact
(sclc), then each of the two continuity conditions in Proposition 5.35 is in turn
equivalent to weak measurability of $u$, in that for each $\varphi, \psi \in H$ the function

$$x \mapsto \langle\varphi, u(x)\psi\rangle$$

from $G$ to $\mathbb{C}$ is (Borel) measurable.

Proof. This spectacular result is due to von Neumann, who more generally proved
that a measurable homomorphism between sclc groups is continuous. This implies
the claim: first, if $H$ is separable, then the group $U(H)$ is sclc in its weak operator
topology, so that if the map $G \to U(H)$, $x \mapsto u(x)$, is weakly measurable, then it
is continuous in the weak topology on $U(H)$. Second, for any Hilbert space, weak
(operator) continuity of a unitary representation implies strong continuity (so that,
given the trivial converse, weak and strong continuity of unitary group
representations are equivalent). We only prove this last claim: for $x, y \in G$, we compute

$$\|(u(y) - u(x))\psi\|^2 = \|u(x)\psi\|^2 + \|u(y)\psi\|^2 - \langle u(x)\psi, u(y)\psi\rangle - \langle u(y)\psi, u(x)\psi\rangle$$

$$= 2\|\psi\|^2 - \langle\psi, u(x^{-1}y)\psi\rangle - \langle\psi, u(y^{-1}x)\psi\rangle.$$

Weak continuity obviously implies that the function $x \mapsto \langle\psi, u(x)\psi\rangle$ is continuous
at the identity $e \in G$, so if $y = x_\lambda \to x$, then $\|(u(x_\lambda) - u(x))\psi\| \to 0$. □

In view of this, it is hardly a restriction for a unitary representation of a locally
compact group on a Hilbert space to be continuous in the sense of Proposition 5.35, so
we always assume this in what follows. Furthermore, any group we consider is
locally compact, so this will be a standing assumption, too. An important consequence
of this assumption is the existence of a translation-invariant measure on $G$.

Theorem 5.37. Each locally compact group $G$ has a canonical nonzero (outer regular
Borel) measure $\mu$, called Haar measure, which is left-invariant in that

$$\int_G d\mu(x)\, L_y f(x) = \int_G d\mu(x)\, f(x), \tag{5.138}$$

for each $f \in C_c(G)$ and $y \in G$, where the left translation $L_y$ of $f$ by $y$ is defined by

$$L_y f(x) = f(y^{-1}x). \tag{5.139}$$

This measure is unique up to scalar multiplication. Moreover, if $G$ is compact, then:

1. $\mu$ is finite and hence can be normalized to a probability measure, i.e.,

$$\mu(G) = 1. \tag{5.140}$$

2. $\mu$ is also right-invariant in that

$$\int_G d\mu(x)\, R_y f(x) = \int_G d\mu(x)\, f(x), \tag{5.141}$$

where the right translation $R_y$ of $f$ by $y \in G$ is defined by

$$R_y f(x) = f(xy). \tag{5.142}$$

3. $\mu$ is invariant under inversion, in that

$$\int_G d\mu(x)\, f(x^{-1}) = \int_G d\mu(x)\, f(x). \tag{5.143}$$

Existence is due to Haar and uniqueness was first proved by von Neumann. One
often writes $dx \equiv d\mu(x)$ for Haar measure. Here are some examples:

• For $G = \mathbb{R}^n$, Haar measure equals Lebesgue measure (up to a constant); eqs.
(5.139) and (5.141) state the familiar translation invariance of $d^n x$.

• For $G = \mathbb{T}$, we have

$$\int_{\mathbb{T}} d\mu(z)\, f(z) = \frac{1}{2\pi} \int_0^{2\pi} d\theta\, f(e^{i\theta}). \tag{5.144}$$

• For $G = GL_n(\mathbb{R})$ with $X = (x_{ij})$, we have

$$d\mu(X) = |\det(X)|^{-n} \prod_{i,j=1}^n dx_{ij}, \tag{5.145}$$

which for $G = SL_n(\mathbb{R})$ of course simplifies to $d\mu(X) = \prod_{i,j} dx_{ij}$.
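
The circle example (5.144) is easy to probe numerically. The sketch below, assuming NumPy, approximates the Haar integral on $\mathbb{T}$ by an average over equally spaced points and compares it with the integrals of a translated and an inverted function, in the spirit of (5.138), (5.140) and (5.143). The test function `f`, the group element `w`, and the helper name `haar_integral` are arbitrary illustrative choices.

```python
import numpy as np

def haar_integral(f, n=4096):
    """Approximate (5.144): average f over n equally spaced points of the circle."""
    theta = 2 * np.pi * np.arange(n) / n
    return np.mean(f(np.exp(1j * theta)))

def f(z):
    # An arbitrary smooth test function on the circle.
    return np.exp(z.real) * (1 + z.imag) ** 2

w = np.exp(1j * 0.7)  # fixed group element used for the translation

original = haar_integral(f)
translated = haar_integral(lambda z: f(z / w))   # L_w f(z) = f(w^{-1} z)
inverted = haar_integral(lambda z: f(1 / z))     # f(z^{-1})
total_mass = haar_integral(lambda z: np.ones_like(z.real))

print(original, translated, inverted, total_mass)
```

Since $\mathbb{T}$ is abelian, left and right translations coincide, so this single check covers both (5.138) and (5.141).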

Definition 5.38. A unitary representation $u$ of a group $G$ on a Hilbert space $H$ is
irreducible if the only closed subspaces $K$ of $H$ that are stable under $u(G)$ (in the
sense that if $\psi \in K$, then $u(x)\psi \in K$ for all $x \in G$) are either $K = H$ or $K = \{0\}$.

We will often need two important results about irreducibility. The first is Schur's
Lemma, in which the commutant $S'$ of some subset $S \subset B(H)$ is defined by

$$S' = \{a \in B(H) \mid ab = ba \ \forall b \in S\}. \tag{5.146}$$

Lemma 5.39. A unitary representation $u$ of a group $G$ is irreducible iff

$$u(G)' = \mathbb{C} \cdot 1, \tag{5.147}$$

i.e., if $au(x) = u(x)a$ for each $x \in G$ implies $a = \lambda \cdot 1_H$ for some $\lambda \in \mathbb{C}$.

This follows from Theorem C.90, of which the above lemma is a special case: take
$A = u(G)'' \equiv (u(G)')'$. The second is part of the Peter-Weyl Theorem.

Theorem 5.40. Irreducible representations of compact groups are finite-dimensional.

Proof. We first reduce the situation to the unitary case: if $\langle\cdot,\cdot\rangle'$ is the given inner
product on $H$, we define a new inner product $\langle\cdot,\cdot\rangle$ by averaging with respect to
Haar measure $dx \equiv d\mu(x)$, i.e.,

$$\langle\psi, \varphi\rangle = \int_G dx\, \langle u(x)\psi, u(x)\varphi\rangle'. \tag{5.148}$$

Using (5.141), it is easy to verify that this new inner product makes $u$ unitary.

So let $u : G \to U(H)$ be an irreducible unitary representation. For each unit vector
$\varphi \in H$ and $x \in G$, we define the following projection and its $G$-average:

$$e_{u(x)\varphi} = |u(x)\varphi\rangle\langle u(x)\varphi|; \tag{5.149}$$

$$W_\varphi = \int_G dx\, e_{u(x)\varphi}. \tag{5.150}$$

The Weyl operator (5.150) is initially defined as a quadratic form by

$$\langle\psi_1, W_\varphi \psi_2\rangle = \int_G dx\, \langle\psi_1, e_{u(x)\varphi}\psi_2\rangle. \tag{5.151}$$

The integral exists because the integrand is continuous and bounded, defining a
bounded quadratic form by the estimate $|\langle\psi_1, W_\varphi \psi_2\rangle| \leq \|\psi_1\| \|\psi_2\|$, where we
assumed (5.140) and used $\|e_{u(x)\varphi}\| = 1$, as (5.149) is a nonzero projection. Thus the
operator $W_\varphi$ may be reconstructed from its matrix elements (5.151), cf. Proposition
B.79. It is easy to verify that $[W_\varphi, u(y)] = 0$ for each $y \in G$, so that Schur's Lemma
yields $W_\varphi = \lambda_\varphi \cdot 1_H$ for some $\lambda_\varphi \in \mathbb{C}$. Hence $\langle\psi, W_\varphi \psi\rangle = \lambda_\varphi \|\psi\|^2$, in other words,

$$\int_G dx\, |\langle\psi, u(x)\varphi\rangle|^2 = \lambda_\varphi \|\psi\|^2. \tag{5.152}$$

If we now interchange $\varphi$ and $\psi$ and use (5.143) we find $\lambda_\varphi \|\psi\|^2 = \lambda_\psi \|\varphi\|^2$, so that,
taking $\psi$ to be a unit vector, too, since $\psi$ and $\varphi$ are arbitrary we obtain $\lambda_\varphi = \lambda_\psi \equiv \lambda$,
where in fact $\lambda > 0$, as follows by taking $\psi = \varphi$ in (5.152). Finally, take $n$
orthonormal vectors $(v_1, \ldots, v_n)$ in $H$, so that also $(u(x)v_1, \ldots, u(x)v_n)$ are orthonormal
(since $u(x)$ is unitary), upon which Bessel's inequality (B.212) gives

$$\sum_{i=1}^n |\langle\psi, u(x)v_i\rangle|^2 \leq \|\psi\|^2. \tag{5.153}$$

Integrating both sides over $G$, taking $\|\psi\| = 1$, and using (5.140) gives

$$\sum_{i=1}^n \int_G dx\, |\langle\psi, u(x)v_i\rangle|^2 \leq 1. \tag{5.154}$$

On the other hand, summing (5.152) over $i$ simply yields $n\lambda$, whence $n\lambda \leq 1$, for
any $n \leq \dim(H)$. Since $\lambda > 0$ this forces $\dim(H) < \infty$. □
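
For a finite group, Haar measure is normalized counting measure, so the averaging argument can be tried out directly. The following Python sketch, assuming NumPy, uses the two-dimensional irreducible representation of the symmetric group $S_3$ by a rotation and a reflection; this group, the vectors `phi` and `psi`, and the helper name `weyl_average` are illustrative assumptions, not taken from the text. Taking the trace of (5.150) shows that in finite dimension $\lambda = 1/\dim(H)$, which the sketch reproduces as $\lambda = 1/2$.

```python
import numpy as np

# Two-dimensional irreducible representation of S_3 (illustrative choice):
# generated by a rotation through 120 degrees and a reflection.
c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
rotation = np.array([[c, -s], [s, c]])
reflection = np.array([[1.0, 0.0], [0.0, -1.0]])
group = [np.linalg.matrix_power(rotation, k) @ m
         for k in range(3) for m in (np.eye(2), reflection)]

def weyl_average(phi):
    """Finite-group analogue of (5.150): average of the projections e_{u(x) phi}."""
    phi = phi / np.linalg.norm(phi)
    return sum(np.outer(g @ phi, (g @ phi).conj()) for g in group) / len(group)

phi = np.array([0.6, 0.8])
psi = np.array([1.0, 0.0])
W = weyl_average(phi)

# W commutes with every u(y), so by Schur's Lemma it is a multiple of the identity.
commutes = all(np.allclose(W @ g, g @ W) for g in group)

# Left-hand side of (5.152), with the integral replaced by a group average.
lhs = np.mean([abs(np.vdot(psi, g @ phi)) ** 2 for g in group])

print(W, commutes, lhs)
```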
5.7 Representations of Lie groups and Lie algebras

We now assume that $G$ is a Lie group; as in §3.3, for our purposes we may restrict
ourselves to linear Lie groups, i.e. closed subgroups of $GL_n(\mathbb{K})$ for $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.

Let $u : G \to U(H)$ be a unitary representation of a Lie group $G$ on some Hilbert
space $H$ (assumed strongly continuous). If $H$ is finite-dimensional, the following
operation is unproblematic; for $A \in \mathfrak{g}$ (i.e. the Lie algebra of $G$) we define an operator

$$u'(A) : H \to H; \tag{5.155}$$

$$u'(A) = \frac{d}{dt} u(e^{tA})_{|t=0}. \tag{5.156}$$

This gives a linear map $u' : \mathfrak{g} \to B(H)$, which satisfies

$$[u'(A), u'(B)] = u'([A, B]); \tag{5.157}$$

$$u'(A)^* = -u'(A). \tag{5.158}$$

Note that physicists use Planck's constant $\hbar > 0$ and like to write

$$\pi(A) = i\hbar u'(A), \tag{5.159}$$

so that one has the following commutation relations and self-adjointness condition:

$$[\pi(A), \pi(B)] = i\hbar \pi([A, B]); \tag{5.160}$$

$$\pi(A)^* = \pi(A). \tag{5.161}$$

If one knows that $u' : \mathfrak{g} \to B(H)$ comes from $u : G \to U(H)$, one conversely has

$$u(e^A) = e^{u'(A)}. \tag{5.162}$$

More generally, we call a map $\rho : \mathfrak{g} \to B(H)$ (where $H \cong \mathbb{C}^n$ remains
finite-dimensional, so that $\rho : \mathfrak{g} \to M_n(\mathbb{C})$) a skew-adjoint representation of $\mathfrak{g}$ on $H$ if

$$[\rho(A), \rho(B)] = \rho([A, B]); \tag{5.163}$$

$$\rho(A)^* = -\rho(A). \tag{5.164}$$

The property of irreducibility of such a representation $\rho : \mathfrak{g} \to B(H)$ is defined in
the same way as for groups, namely that the only linear subspaces of $H \cong \mathbb{C}^n$ that
are stable under $\rho(\mathfrak{g})$ are $\{0\}$ and $H$. Equivalently, by Schur's Lemma, $\rho(\mathfrak{g})$ is
irreducible iff the only operators that commute with all $\rho(A)$ are multiples of the unit
operator. If $\rho = u'$ for some unitary representation $u(G)$, it is easy to see that $u$
is irreducible iff $u'$ is irreducible. In view of this, it is a reasonable strategy to try
and construct irreducible unitary representations $u(G)$ by starting, as it were, from
$u'(\mathfrak{g})$. More precisely, if $\rho$ is some (irreducible) skew-adjoint representation of $\mathfrak{g}$,
we may ask if there is a (necessarily irreducible) unitary representation $u(G)$ such
that $\rho = u'$. Writing $\exp(\rho)$ for $u$, one would therefore hope that

$$u(e^A) = e^{\rho(A)} = e^{-\frac{i}{\hbar}\pi(A)}, \tag{5.165}$$

as in (5.162), where $\pi(A) = i\hbar\rho(A)$, cf. (5.159). Note that if $G$ is connected, then $\rho$ duly defines $u(x)$ for each $x \in G$
through (5.165), since by Lie theory every element $x$ of a connected Lie group is a
finite product $x = \exp(A_1) \cdots \exp(A_n)$ of exponentials of elements $(A_1, \ldots, A_n)$ of $\mathfrak{g}$.

In general, this hope is in vain, since although each operator $\exp(\rho(A))$ is unitary, the
representation property $u(x)u(y) = u(xy)$ may fail for global reasons. For example,
if $G = SO(3)$, then $\mathfrak{g} \cong \mathbb{R}^3$ with basis $(J_1, J_2, J_3)$, as in (3.66). Define an a priori
linear map $\rho : \mathfrak{g} \to M_2(\mathbb{C})$ by linear extension of

$$\rho(J_k) = -\tfrac{1}{2} i \sigma_k, \tag{5.166}$$

where $(\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices (5.42), so that physicists would write

$$\pi(J_k) = \tfrac{1}{2} \hbar \sigma_k, \tag{5.167}$$

cf. (5.159). This is easily checked to give a skew-adjoint representation of $\mathfrak{g}$, but it
does not exponentiate to a unitary representation of $SO(3)$: as already mentioned
after Proposition 5.46, if $\mathbf{u}$ is a unit vector in $\mathbb{R}^3$, then a rotation $R_\theta(\mathbf{u})$ around the
$\mathbf{u}$-axis by an angle $\theta \in [0, 2\pi]$ is represented by

$$u(R_\theta(\mathbf{u})) = \cos(\theta/2) \cdot 1_2 + i \sin(\theta/2)\, \mathbf{u} \cdot \boldsymbol{\sigma}. \tag{5.168}$$

Consequently, $u(R_\pi(\mathbf{u})) = i\, \mathbf{u} \cdot \boldsymbol{\sigma}$, so that $u(R_\pi(\mathbf{u}))^2 = -1_2$, although within $SO(3)$
one has $R_\pi(\mathbf{u})^2 = e$, the unit of $SO(3)$, so that $u(R_\pi(\mathbf{u}))^2 \neq u(R_\pi(\mathbf{u})^2)$.
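
The failure of (5.166) to exponentiate to $SO(3)$ can be seen in a few lines of code. This is a minimal Python sketch, assuming NumPy; the matrix `J3` is the standard generator of rotations about the third axis, which is assumed here to agree with the basis element in (3.66), and `exp_skew` is a hypothetical helper that exponentiates a skew-adjoint matrix through the spectral decomposition of $i$ times that matrix.

```python
import numpy as np

def exp_skew(a):
    """exp(a) for a skew-adjoint matrix a, via the spectral theorem applied to i*a."""
    evals, evecs = np.linalg.eigh(1j * a)   # i*a is self-adjoint
    return evecs @ np.diag(np.exp(-1j * evals)) @ evecs.conj().T

# Generator of rotations about the third axis in so(3) (assumed form of J_3).
J3 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 0.0]])
sigma3 = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
rho_J3 = -0.5j * sigma3                     # (5.166) for k = 3

rotation_by_2pi = exp_skew(2 * np.pi * J3)       # the unit of SO(3)
spinor_by_2pi = exp_skew(2 * np.pi * rho_J3)     # minus the 2x2 unit matrix
spinor_by_pi = exp_skew(np.pi * rho_J3)          # squares to minus the unit matrix

print(np.round(rotation_by_2pi.real, 12))
print(np.round(spinor_by_2pi, 12))
```

A rotation through $2\pi$ is the identity in $SO(3)$ but is sent to $-1_2$ by $\exp(\rho)$, so $\exp(\rho)$ cannot be a well-defined homomorphism on $SO(3)$.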

However, $\rho$ does exponentiate to a representation of $SU(2)$, which happens to
be the universal covering group of $SO(3)$. This is typical of the general situation,
which we state without proofs. We first need a refinement of Lie's Third Theorem:

Theorem 5.41. Let $G$ be a connected Lie group with Lie algebra $\mathfrak{g}$. There exists
a simply connected Lie group $\tilde{G}$, unique up to isomorphism, such that:

• The Lie algebra of $\tilde{G}$ is $\mathfrak{g}$.

• $G \cong \tilde{G}/D$, where $D$ is a discrete normal subgroup of the center of $\tilde{G}$.

• $D \cong \pi_1(G)$, i.e., the fundamental group of $G$, which is therefore abelian.

For example, for $G = SO(3)$ we have $\tilde{G} = SU(2)$ and $D = \mathbb{Z}_2$, cf. Proposition 5.46.

Theorem 5.42. Let $G_1$ and $G_2$ be Lie groups, with Lie algebras $\mathfrak{g}_1$ and $\mathfrak{g}_2$,
respectively, and suppose that $G_1$ is simply connected. Then every Lie algebra
homomorphism $\varphi : \mathfrak{g}_1 \to \mathfrak{g}_2$ comes from a unique Lie group homomorphism $\Phi : G_1 \to G_2$
through $\varphi = \Phi'$, where (realizing $G_1$ and $G_2$ as matrices)

$$\Phi'(A) = \frac{d}{dt} \Phi(e^{tA})_{|t=0}. \tag{5.169}$$

Let $H$ be a finite-dimensional Hilbert space, so that $B(H) \cong M_n(\mathbb{C})$, where $n =
\dim(H)$, and take $U(H) \cong U_n(\mathbb{C})$ to be the group of all unitary matrices on $\mathbb{C}^n$. The
Lie algebra $\mathfrak{u}_n(\mathbb{C})$ of $U_n(\mathbb{C})$ consists of all skew-adjoint $n \times n$ complex matrices.
Since irreducibility is preserved under the correspondence $u(G) \leftrightarrow u'(\mathfrak{g})$, we infer:

Corollary 5.43. Let $G$ be a simply connected Lie group with Lie algebra $\mathfrak{g}$. Any
finite-dimensional skew-adjoint representation $\rho : \mathfrak{g} \to \mathfrak{u}_n(\mathbb{C})$ of $\mathfrak{g}$ comes from a
unique unitary representation $u(G)$ through (5.156), in which case we have

$$u(e^A) = e^{\rho(A)} \quad (A \in \mathfrak{g}). \tag{5.170}$$

Thus there is a bijective correspondence between finite-dimensional unitary
representations of $G$ and finite-dimensional skew-adjoint representations of $\mathfrak{g}$. In
particular, if $G$ is compact, this specializes to a bijective correspondence between unitary
irreducible representations of $G$ and skew-adjoint irreducible representations of $\mathfrak{g}$.

If $G = \tilde{G}/D$ is connected but not simply connected, then a finite-dimensional
skew-adjoint representation $\rho : \mathfrak{g} \to B(H)$ exponentiates to a unitary representation
$u : G \to U(H)$ iff the representation $\exp(\rho) : \tilde{G} \to U(H)$ is trivial on $D$.

For example, for $G = SO(3)$, the last condition is satisfied for the irreducible
representations with integer spins $j \in \mathbb{N}$ (as well as for $j = 0$), see §5.8.

A similar construction is possible when $H$ is infinite-dimensional, except for the
fact that the derivative in (5.156) may not exist. For example, $G = \mathbb{R}$ has its canonical
regular representation on $H = L^2(\mathbb{R})$, defined by $u(a)\psi(x) = \psi(x - a)$, in which case
(5.159) gives some multiple of the momentum operator $-i\hbar\, d/dx$. This operator is
unbounded and hence is not defined on all of $H$, see also §5.11 and §5.12. As in
Stone's Theorem 5.73, this problem is solved by finding a suitable domain in $H$ on
which the underlying limit, taken strongly, does exist. This is the Gårding domain

$$D_G = \{u(f)\psi \mid f \in C_c^\infty(G), \psi \in H\}, \tag{5.171}$$

where for each $f \in C_c^\infty(G)$ (or even $f \in L^1(G)$) the operator $u(f)$ is defined by

$$u(f) = \int_G dx\, f(x) u(x). \tag{5.172}$$

Like the derivative $u'$, this integral is most easily defined weakly, i.e., the (bounded)
operator $u(f)$ is initially defined as a bounded quadratic form

$$Q(\varphi, \psi) = \int_G dx\, f(x) \langle\varphi, u(x)\psi\rangle, \tag{5.173}$$

from which the operator $u(f)$ may be reconstructed as in Proposition B.79. Note
that the function $x \mapsto \langle\varphi, u(x)\psi\rangle$ is in $C_b(G)$, so that the integral (5.173) exists.

It can be shown that $D_G$ is dense in $H$, as well as invariant under $u'(\mathfrak{g})$, in the
sense that if $\psi \in D_G$, then $u'(A)\psi \in D_G$ for any $A \in \mathfrak{g}$. Furthermore, for each $\varphi \in D_G$
the function $x \mapsto u(x)\varphi$ from $G$ to $H$ is smooth (if $G$ is unimodular this property
even characterizes $D_G$). The commutation relations (5.157) then hold on $D_G$, but
the equalities (5.164) do not; one has to choose between (5.157) and (5.164), since
the latter holds for the closure of each $\pi(A)$ (i.e., each $i\rho(A)$ is essentially
self-adjoint on $D_G$), whose domain however depends on $A$: there is no common domain
on which each $i\rho(A)$ is self-adjoint and the commutation relations (5.157) hold.
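
The translation example can be illustrated numerically on a smooth, rapidly decaying vector, for which the derivative in (5.156) does exist. The Python sketch below, assuming NumPy, takes a Gaussian as test vector and approximates $\frac{d}{dt} u(t)\psi$ at $t = 0$ by a symmetric difference quotient; the Gaussian, the grid, the step size `h`, and the helper name `translate` are illustrative assumptions. The result agrees with $-\psi'$, so that $i\hbar u'$ is indeed a multiple of $-i\hbar\, d/dx$.

```python
import numpy as np

def psi(x):
    """Gaussian test vector (an illustrative choice of smooth vector)."""
    return np.exp(-x ** 2 / 2)

def translate(a, f):
    """Regular representation of R: (u(a) f)(x) = f(x - a)."""
    return lambda x: f(x - a)

x = np.linspace(-5.0, 5.0, 201)
h = 1e-4

# Symmetric difference quotient approximating d/dt u(t) psi at t = 0.
generator_psi = (translate(h, psi)(x) - translate(-h, psi)(x)) / (2 * h)

# Exact value of -psi'(x) for the Gaussian above.
minus_derivative = x * psi(x)

max_error = np.max(np.abs(generator_psi - minus_derivative))
print(max_error)
```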
5.8 Irreducible representations of SU(2)

One of the most important groups in quantum physics is $SU(2)$, both as an internal
symmetry group (e.g. of the Heisenberg model of ferromagnetism, of the weak
nuclear interaction, and possibly also of (loop) quantum gravity) and as a spatial
symmetry group in disguise (all projective unitary representations of $SO(3)$ come from
unitary representations of $SU(2)$, preserving irreducibility, cf. Corollary 5.61). In
this section we review the well-known classification and construction of its unitary
irreducible representations. Since $SU(2)$ is compact, by Theorem 5.40 all its unitary
irreducible representations are finite-dimensional. Since $G = SU(2)$ is also simply
connected, by Corollary 5.43 its irreducible finite-dimensional (unitary)
representations $u$ bijectively correspond to the irreducible finite-dimensional skew-adjoint
representations $\rho = u'$ of its Lie algebra $\mathfrak{g}$. Hence our job is to find the latter.

We already encountered the basis (3.66) of the Lie algebra $\mathfrak{so}(3) \cong \mathbb{R}^3$ of $SO(3)$;
the corresponding basis of the Lie algebra $\mathfrak{su}(2)$ of $SU(2)$ is $(S_1, S_2, S_3)$, where

$$S_k = -\tfrac{1}{2} i \sigma_k, \tag{5.174}$$

and the $\sigma_k$ are the Pauli matrices given in (5.42); linear extension of the map $J_k \mapsto S_k$
defines an isomorphism between $\mathfrak{so}(3)$ and $\mathfrak{su}(2)$. These matrices satisfy

$$[S_i, S_j] = \varepsilon_{ijk} S_k, \tag{5.175}$$

where $\varepsilon_{ijk}$ is the totally anti-symmetric symbol with $\varepsilon_{123} = 1$ etc., so that (5.175)
comes down to $[S_1, S_2] = S_3$, $[S_3, S_1] = S_2$, and $[S_2, S_3] = S_1$. By linearity, finding $\rho$
is the same as finding $n \times n$ matrices

$$L_k = i\rho(S_k) \tag{5.176}$$

that satisfy

$$[L_i, L_j] = i\varepsilon_{ijk} L_k, \tag{5.177}$$

i.e., $[L_1, L_2] = iL_3$, etc., and

$$L_k^* = L_k. \tag{5.178}$$

It turns out to be convenient to introduce the ladder operators

$$L_\pm = L_1 \pm iL_2, \tag{5.179}$$

with ensuing commutation relations

$$[L_3, L_\pm] = \pm L_\pm; \tag{5.180}$$

$$[L_+, L_-] = 2L_3. \tag{5.181}$$

Furthermore, we define the Casimir operator

$$C = L_1^2 + L_2^2 + L_3^2, \tag{5.182}$$

which, crucially, commutes with each $L_k$, i.e.,

$$[C, L_k] = 0 \quad (k = 1, 2, 3). \tag{5.183}$$

By Schur's lemma, in any irreducible representation we therefore must have

$$C = c \cdot 1_H, \tag{5.184}$$

where $c \in \mathbb{R}$ (in fact, $c \geq 0$). We will also use the additional algebraic relations

$$L_+ L_- = C - L_3(L_3 - 1_H); \tag{5.185}$$

$$L_- L_+ = C - L_3(L_3 + 1_H). \tag{5.186}$$

The simple idea is now to diagonalize $L_3$, which is possible as $L_3^* = L_3$. Hence

$$H = \bigoplus_{\lambda \in \sigma(L_3)} H_\lambda, \tag{5.187}$$

where $\sigma(L_3)$ is the spectrum of $L_3$ (which in this finite-dimensional case consists
of its eigenvalues), and $H_\lambda$ is the eigenspace of $L_3$ for eigenvalue $\lambda$ (i.e., if $v \in H_\lambda$,
then $L_3 v = \lambda v$). The structure of (5.187) in irreducible representations is as follows.

Lemma 5.44. Let $\rho : \mathfrak{su}(2) \to B(H)$ be a finite-dimensional skew-adjoint
irreducible representation, so that (5.177) holds. Then the spectrum $\sigma(L_3)$ of the
self-adjoint operator $L_3 = i\rho(S_3)$ is given by

$$\sigma(L_3) = \{-j, -j+1, \ldots, j-1, j\}. \tag{5.188}$$

If (5.187) is the spectral decomposition of $H$ relative to $L_3$, then:

1. The subspace $H_\lambda$ is one-dimensional for each $\lambda \in \sigma(L_3)$;

2. For $\lambda < j$ the operator $L_+$ maps $H_\lambda$ to $H_{\lambda+1}$, whereas $L_+ = 0$ on $H_j$;

3. For $\lambda > -j$ the operator $L_-$ maps $H_\lambda$ to $H_{\lambda-1}$, whereas $L_- = 0$ on $H_{-j}$.

Proof. For any $\lambda \in \sigma(L_3)$ and nonzero $v_\lambda \in H_\lambda$, we have:

• either $\lambda + 1 \in \sigma(L_3)$ and $L_+ v_\lambda \in H_{\lambda+1}$ (as a nonzero vector);

• or $L_+ v_\lambda = 0$.

Indeed, (5.180) gives $L_3(L_+ v_\lambda) = (\lambda + 1) L_+ v_\lambda$, which immediately yields the
claim. Similarly, either $\lambda - 1 \in \sigma(L_3)$ and $L_- v_\lambda \in H_{\lambda-1}$, or $L_- v_\lambda = 0$. Now let
$\lambda_0 = \min \sigma(L_3)$ be the smallest eigenvalue of $L_3$, and pick some $0 \neq v_{\lambda_0} \in H_{\lambda_0}$.
Since $H$ is finite-dimensional by assumption, there must be some $k \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$
such that $L_+^{k+1} v_{\lambda_0} = 0$, whereas all vectors $L_+^l v_{\lambda_0}$ for $l = 0, \ldots, k$ are nonzero (and
lie in $H_{\lambda_0 + l}$). With $c$ defined as in (5.184), it then follows from (5.185) - (5.186) that

$$c - \lambda_0(\lambda_0 - 1) = 0; \tag{5.189}$$

$$c - (\lambda_0 + k)(\lambda_0 + k + 1) = 0. \tag{5.190}$$

These relations imply $\lambda_0 = -k/2$, so that by the above bullet points we also have

$$\{-k/2, -k/2 + 1, \ldots, k/2 - 1, k/2\} \subseteq \sigma(L_3). \tag{5.191}$$

To prove equality, as in (5.188), consider the vector space

$$H' = \mathbb{C} \cdot v_{\lambda_0} \oplus \mathbb{C} \cdot L_+ v_{\lambda_0} \oplus \cdots \oplus \mathbb{C} \cdot L_+^k v_{\lambda_0} \subseteq H; \tag{5.192}$$

this is just the subspace of $H$ with basis $(v_{\lambda_0}, L_+ v_{\lambda_0}, \ldots, L_+^k v_{\lambda_0})$. By the
previous arguments following from (5.180), we see that the operators $L_+$ and $L_-$
never leave $H'$, and the same is trivially true for $L_3$. Therefore, if $\rho$ is irreducible,
then we must have $H' = H$ (and conversely). All claims of the lemma are now
trivially verified on $H'$. □

It should be clear from this proof that the actions of L^,L_, and L 3 (and hence of all 
elements of su( 2 )) on H' = H) are fixed, so that p is determined by its dimension 

dim(//) =2; + l, (5.193) 

from which it follows that j can only take the values 0,1 / 2 , 1 ,3/ 2 ,.... 

It remains to fix an inner product on H' in which p is skew-adjoint, i.e., in which 
L 3 = L 3 and L*^ = L (which implies that Lj = Li and Lj = L, 2 , which jointly imply 
p{X*) = —p{X) for any X G g). This may be done in principle by starting with 
any inner product, integrating p to a unitary representation of SU (2), and using the 
construction explained at the beginning of the proof of Theorem 5.40. In practice, it 
is easier to just calculate: take // = C” with n = 2j +1, standard inner product, and 
standard orthonormal basis (m;), labeled as Z = 0,1,...,2/). Then put 


L-iUi = {l-j)ui\ 

(5.194) 

L+ui = \/{l + l){n-l- 1)m/+i; 

(5.195) 

L-Ui = s/l{n-l)ui^i. 

(5.196) 


Note that (5.195) is even formally correct for $l = 2j$, since in that case $n - 2j - 1 = 0$,
and similarly, (5.196) formally holds even for $l = 0$. The commutation relations
(5.180) - (5.181) as well as the above conditions for skew-adjointness may be
explicitly verified, from which it follows that for any prescribed dimension (5.193) we
have found a skew-adjoint realization of $\rho$. Clearly, $u_l = v_{l-j}$.
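The explicit verification mentioned above can be sketched numerically. The following is a minimal check, not part of the original argument: it assumes that the relations (5.180) - (5.181) read $[L_3, L_\pm] = \pm L_\pm$ and $[L_+, L_-] = 2L_3$, and that the Casimir operator is $L_3^2 + \tfrac{1}{2}(L_+L_- + L_-L_+)$ with value $c = j(j+1)$. The function names are illustrative only.

```python
import math

def spin_matrices(two_j):
    """Matrices of L3, L+, L- on C^n, n = 2j + 1, in the basis u_0, ..., u_{2j}.

    Column l holds the image of u_l, following (5.194)-(5.196).
    """
    n = two_j + 1
    j = two_j / 2
    L3 = [[0.0] * n for _ in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    Lm = [[0.0] * n for _ in range(n)]
    for l in range(n):
        L3[l][l] = l - j
        if l + 1 < n:
            Lp[l + 1][l] = math.sqrt((l + 1) * (n - l - 1))
        if l >= 1:
            Lm[l - 1][l] = math.sqrt(l * (n - l))
    return L3, Lp, Lm

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][c] for k in range(n)) for c in range(n)]
            for i in range(n)]

def combine(alpha, a, beta, b):
    """Entrywise alpha * a + beta * b."""
    n = len(a)
    return [[alpha * a[i][c] + beta * b[i][c] for c in range(n)] for i in range(n)]

def commutator(a, b):
    return combine(1.0, matmul(a, b), -1.0, matmul(b, a))

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][c] - b[i][c]) < tol for i in range(n) for c in range(n))

def check_spin(two_j):
    """True if the assumed relations hold in dimension n = 2j + 1."""
    n = two_j + 1
    j = two_j / 2
    L3, Lp, Lm = spin_matrices(two_j)
    zero = [[0.0] * n for _ in range(n)]
    identity = [[1.0 if i == c else 0.0 for c in range(n)] for i in range(n)]
    # The matrices are real, so the adjoint of L+ is its transpose.
    transpose_Lp = [[Lp[c][i] for c in range(n)] for i in range(n)]
    casimir = combine(1.0, matmul(L3, L3), 0.5,
                      combine(1.0, matmul(Lp, Lm), 1.0, matmul(Lm, Lp)))
    return (
        close(commutator(L3, Lp), Lp)
        and close(commutator(L3, Lm), combine(-1.0, Lm, 0.0, zero))
        and close(commutator(Lp, Lm), combine(2.0, L3, 0.0, zero))
        and close(transpose_Lp, Lm)
        and close(casimir, combine(j * (j + 1), identity, 0.0, zero))
    )
```

Calling `check_spin(two_j)` for `two_j = 0, 1, 2, ...` covers spins $j = 0, 1/2, 1, \ldots$; the argument is $2j$ so that it stays an integer.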

In view of Theorem 5.40 and Corollary 5.43 we have therefore proved: 

Theorem 5.45. Up to unitary equivalence, any (unitary) irreducible representation
of $SU(2)$ is completely determined by its dimension $n = \dim(H)$, and any dimension
$n \in \mathbb{N}$ occurs (here $\mathbb{N} = \{1, 2, \ldots\}$, and we write $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$). Furthermore, if $j$ is the number in (5.188), we have

\[ n = 2j + 1. \tag{5.197} \]

Physicists typically label these irreducible representations by $j$ (called the spin of
the given representation) rather than by $n$, or even by $c = j(j+1)$, cf. (5.184).
Corollary 5.43 shows that one may pass from $\rho(\mathfrak{su}(2))$ to a unitary representation
$u(SU(2))$, of which one may give a direct realization. For $j \in \mathbb{N}_0/2$, define $H_j$ as the
complex vector space of all homogeneous polynomials $f$ in two variables $z = (z_1, z_2)$
of degree $2j$. A basis of $H_j$ is given by $z_1^{2j}, z_1^{2j-1}z_2, \ldots, z_1 z_2^{2j-1}, z_2^{2j}$, which has
$2j + 1$ elements. So $\dim(H_j) = 2j + 1$. Then consider the map


\[ D_j : SU(2) \to B(H_j); \tag{5.198} \]
\[ D_j(u)f(z) = f(zu). \tag{5.199} \]

Clearly,

\[ D_j(e)f(z) = f(z \cdot 1_2) = f(z), \tag{5.200} \]

so $D_j(e) = 1$, and

\[ D_j(u)D_j(v)f(z) = D_j(v)f(zu) = f(zuv) = D_j(uv)f(z), \]

so $D_j(u)D_j(v) = D_j(uv)$. Hence $D_j$ is a representation of $SU(2)$.

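We now compute $L_3 = iD_j'(S_3)$ on this space. From (5.156) with $u \rightsquigarrow D_j$, we have

\[ L_3 = -\tfrac{1}{2} i \,\frac{d}{dt} D_j \begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix} \bigg|_{t=0}, \tag{5.201} \]

so that

\[ L_3 f(z) = -\tfrac{1}{2} i \,\frac{d}{dt} f(e^{it} z_1, e^{-it} z_2) \bigg|_{t=0} = \tfrac{1}{2} \left( z_1 \frac{\partial f(z)}{\partial z_1} - z_2 \frac{\partial f(z)}{\partial z_2} \right). \tag{5.202} \]

Similarly, we obtain

\[ L_+ f(z) = z_1 \frac{\partial f(z)}{\partial z_2}; \tag{5.203} \]

\[ L_- f(z) = z_2 \frac{\partial f(z)}{\partial z_1}. \tag{5.204} \]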


Hence $f_{2j}(z) = z_1^{2j}$ gives $L_3 f_{2j} = j f_{2j}$, and $f_0(z) = z_2^{2j}$ gives $L_3 f_0 = -j f_0$. In
general, $f_l(z) = z_1^{l} z_2^{2j-l}$ spans the eigenspace of $L_3$ with eigenvalue $\lambda = -j + l$.

Since $l = 0, 1, \ldots, 2j$, this confirms (5.188), as well as the fact that the corresponding
eigenspaces are all one-dimensional. The rest is easily checked, too, except for the
unitarity of the representation, for which we refer to the proof of Theorem 5.40.

Finally, we return to $SO(3)$. Either explicit exponentiation (5.165), as done for
$j = 1/2$ in (5.168), or the above construction of $D_j$, allows one to verify the crucial
condition stated in Corollary 5.43, namely that $D_j(\delta) = 1_{H_j}$ for $\delta \in D = \mathbb{Z}_2$, which
comes down to $D_j(-1_2) = 1_{H_j}$. This is easily seen to be the case iff $j \in \mathbb{N}_0$.
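The polynomial realization lends itself to a small symbolic check. The sketch below is an illustration under stated assumptions: it takes the operators to act as $L_3 = \tfrac{1}{2}(z_1\partial_1 - z_2\partial_2)$, $L_+ = z_1\partial_2$ and $L_- = z_2\partial_1$ (as read off from (5.202) - (5.204)), assumes the relation $[L_+, L_-] = 2L_3$, and represents a polynomial as a dictionary from exponent pairs $(a, b)$ to coefficients. All function names are hypothetical.

```python
from fractions import Fraction

def clean(f):
    """Drop monomials with zero coefficient so that equal polynomials compare equal."""
    return {m: c for m, c in f.items() if c != 0}

def L3(f):
    # z1^a z2^b is an eigenvector with eigenvalue (a - b) / 2
    return clean({(a, b): Fraction(a - b, 2) * c for (a, b), c in f.items()})

def Lplus(f):
    # z1 * d/dz2 sends z1^a z2^b to b * z1^(a+1) z2^(b-1)
    return clean({(a + 1, b - 1): b * c for (a, b), c in f.items() if b > 0})

def Lminus(f):
    # z2 * d/dz1 sends z1^a z2^b to a * z1^(a-1) z2^(b+1)
    return clean({(a - 1, b + 1): a * c for (a, b), c in f.items() if a > 0})

def scale(k, f):
    return clean({m: k * c for m, c in f.items()})

def subtract(f, g):
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) - c
    return clean(out)

def basis(two_j):
    """The monomials f_l = z1^l z2^(2j - l), l = 0, ..., 2j."""
    return [{(l, two_j - l): Fraction(1)} for l in range(two_j + 1)]

def check_polynomials(two_j):
    """Check L3 f_l = (l - j) f_l and [L+, L-] = 2 L3 on every basis element."""
    j = Fraction(two_j, 2)
    for l, f in enumerate(basis(two_j)):
        if L3(f) != scale(l - j, f):
            return False
        comm = subtract(Lplus(Lminus(f)), Lminus(Lplus(f)))
        if comm != scale(2, L3(f)):
            return False
    return True

def minus_identity(f):
    """D_j(-1_2) f(z) = f(-z): the monomial z1^a z2^b picks up (-1)^(a+b)."""
    return clean({(a, b): (-1) ** (a + b) * c for (a, b), c in f.items()})

def descends_to_so3(two_j):
    """True if D_j(-1_2) acts as the identity on H_j."""
    return all(minus_identity(f) == f for f in basis(two_j))
```

In this sketch `descends_to_so3(two_j)` holds exactly for even `two_j`, i.e. for integer spin, which is the condition $j \in \mathbb{N}_0$ stated above.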

Corollary 5.46. Up to unitary equivalence, each unitary irreducible representation
of $SO(3)$ is completely fixed by its dimension $n = 2j + 1$, where $j \in \mathbb{N}_0$ (so that $n = 1$
for spin-0, $n = 3$ for spin-1, $n = 5$ for spin-2, ...), and each such dimension occurs.

5.9 Irreducible representations of compact Lie groups

Because of its importance for the classical-quantum correspondence (cf. §7.1) we
first reformulate the main result of the previous section (i.e. the classification of the
irreducible representations of $SU(2)$) and on that basis generalize this result to
arbitrary compact Lie groups. This gives a classification of great simplicity and beauty.

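We already encountered the coadjoint representation (3.100) of a Lie group $G$ on
$\mathfrak{g}^*$, given by $(x \cdot \theta)(A) = \theta(x^{-1}Ax)$, where $x \in G$, $\theta \in \mathfrak{g}^*$, $A \in \mathfrak{g}$. The orbits under
this action are called coadjoint orbits. If $G = SO(3)$, we have $\mathfrak{g} \cong \mathbb{R}^3$ under the map

\[ \mathbf{x} \cdot \mathbf{J} = \sum_{k=1}^{3} x_k J_k \mapsto (x_1, x_2, x_3) = \mathbf{x}, \tag{5.205} \]

where the matrices $J_k$ are given in (3.66). Hence also $\mathfrak{g}^* \cong \mathbb{R}^3$ under the map

\[ \theta \mapsto (\theta(J_1), \theta(J_2), \theta(J_3)). \tag{5.206} \]

Writing $R \in SO(3)$ for a generic element $x \in G$, analogously to (5.44), we can
compute the adjoint action $R : A \mapsto RAR^{-1}$, seen as an action on $\mathbb{R}^3$, through

\[ R(\mathbf{x} \cdot \mathbf{J})R^{-1} = (R\mathbf{x}) \cdot \mathbf{J}. \tag{5.207} \]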


Using the fact that the angular momentum matrices transform as vectors, i.e.,

\[ R J_i R^{-1} = \sum_j R_{ji} J_j, \tag{5.208} \]

we find that the adjoint action of $SO(3)$ on $\mathfrak{g}$, seen as $\mathbb{R}^3$, is its defining action. In
general, if $\mathfrak{g} \cong \mathbb{R}^n$ and also $\mathfrak{g}^* \cong \mathbb{R}^n$ under the usual pairing of $\mathbb{R}^n$ and $\mathbb{R}^n$ through
the Euclidean inner product, the coadjoint action of $G$ on $\mathfrak{g}^*$, seen as an action on
$\mathbb{R}^n$, is given by the inverse transpose of the adjoint action on $\mathfrak{g} \cong \mathbb{R}^n$. For $SO(3)$ we
have $(R^{-1})^T = R$, so the coadjoint action of $SO(3)$ on $\mathbb{R}^3$ is just its defining action,
too, and hence the coadjoint orbits are the 2-spheres $S_r^2$ with radius $r \geq 0$.

Turning to $SU(2)$, we now make the identification of $\mathfrak{g}^*$ with $\mathbb{R}^3$ slightly
differently, namely by replacing the $3 \times 3$ real matrices $J_i$ in (5.205) by the $2 \times 2$ matrices
$S_i$ in (5.174), but the computation is similar: using (5.44) - (5.45), we find that the
coadjoint action of $u \in SU(2)$ on $\mathbb{R}^3$ is given by the defining action of $\pi(u) \in SO(3)$,
cf. (5.46). It follows that the coadjoint orbits for $SU(2)$ are the same as for $SO(3)$.

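Returning to general Lie groups $G$ for the moment, assumed connected for
simplicity, we take some coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$, fix a point $\theta \in \mathcal{O}$ (so that $\mathcal{O} = G \cdot \theta \cong G/G_\theta$),
and look at the stabilizer $G_\theta$ and its Lie algebra $\mathfrak{g}_\theta$. Since the derivative $\mathrm{Ad}'$ of
the adjoint action $\mathrm{Ad}$ of $G$ on $\mathfrak{g}$ (defined as in (5.156)) is given by

\[ \mathrm{Ad}'(A) : B \mapsto [A, B], \tag{5.209} \]

it follows that the "infinitesimal stabilizer" $\mathfrak{g}_\theta$ is given by

\[ \mathfrak{g}_\theta = \{A \in \mathfrak{g} \mid \theta([A, B]) = 0 \ \forall B \in \mathfrak{g}\}. \tag{5.210} \]

Consequently, the restriction of $\theta : \mathfrak{g} \to \mathbb{R}$ to $\mathfrak{g}_\theta \subset \mathfrak{g}$ is a Lie algebra homomorphism
(where $\mathbb{R}$ is obviously endowed with the zero Lie bracket). Consider a character
$\chi : G_\theta \to \mathbb{T}$, which is the same thing as a one-dimensional unitary representation
of $G_\theta$. If we regard $\mathbb{T}$ as a closed subgroup of $GL_1(\mathbb{C})$, its Lie algebra $\mathfrak{t}$ is given
by $i\mathbb{R} \subset M_1(\mathbb{C}) = \mathbb{C}$. It is conventional (at least among physicists) to take $-i$ as
the basis element of $\mathfrak{t}$, so that $\mathfrak{t} \cong \mathbb{R}$ under $-it \leftrightarrow t$, and the exponential map
$\exp : \mathfrak{t} \to \mathbb{T}$ (which is the usual one), seen as a map from $\mathbb{R}$ to $\mathbb{T}$, is given by $t \mapsto \exp(-it)$.
Defining the derivative $\chi' : \mathfrak{g}_\theta \to \mathbb{C}$ as in (5.156), it follows that actually
$\chi' : \mathfrak{g}_\theta \to i\mathbb{R}$, so that $i\chi'$ maps $\mathfrak{g}_\theta$ to $\mathbb{R}$ and is a Lie algebra homomorphism.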

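Definition 5.47. Let $G$ be a connected Lie group. A coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$ is called
integral if for some (and hence all) $\theta \in \mathcal{O}$ one has $\theta|_{\mathfrak{g}_\theta} = i\chi'$ for some character
$\chi : G_\theta \to \mathbb{T}$, i.e., if there is a character $\chi$ such that for each $A \in \mathfrak{g}_\theta$ one has

\[ \theta(A) = i\chi'(A) = i \frac{d}{dt} \chi(e^{tA}) \bigg|_{t=0}. \tag{5.211} \]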
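In the simplest case where $G = \mathbb{T}$, the coadjoint action on $\mathfrak{t}^*$ is evidently trivial, so
that $G_\theta = G = \mathbb{T}$ for any $\theta \in \mathfrak{t}^* \cong \mathbb{R}$. Furthermore, any character on $\mathbb{T}$ takes the
form $\chi_n(z) = z^n$, where $n \in \mathbb{Z}$, cf. (C.351). As explained above, if $\mathfrak{t} \cong \mathbb{R}$ and hence
also $\mathfrak{t}^* \cong \mathbb{R}$, the identification of $\lambda \in \mathfrak{t}^*$ with $\lambda \in \mathbb{R}$ is made by $\lambda(-i) \leftrightarrow \lambda$, where
$-i \in \mathfrak{t}$. If $\chi = \chi_n$, the right-hand side of (5.211) evaluated at $A = -i$ equals $n$, so that
(5.211) holds iff $\theta = n$ for some $n \in \mathbb{Z}$. Thus the integral coadjoint orbits in $\mathfrak{t}^*$ are
the integers $\mathbb{Z} \subset \mathbb{R}$. Similarly, if $G = \mathbb{T}^d$, the characters are elements of $\mathbb{Z}^d$, as in

\[ \chi_{(n_1, \ldots, n_d)}(z_1, \ldots, z_d) = z_1^{n_1} \cdots z_d^{n_d}, \tag{5.212} \]

and the integral coadjoint orbits in $\mathfrak{g}^* \cong \mathbb{R}^d$ are the points of the lattice $\mathbb{Z}^d \subset \mathbb{R}^d$.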

For $G = SU(2)$ we take a coadjoint orbit $S_r^2 \subset \mathbb{R}^3$ and fix $\theta_r = (0, 0, r)$. If $r = 0$,
then $G_\theta = G$ and (5.211) holds for the trivial character $\chi \equiv 1$, so the orbit $\{(0, 0, 0)\}$
is integral. Let $r > 0$. Then $G_{\theta_r} \equiv G_r$ consists of the pre-image of $SO(2)$ in $SU(2)$
under the projection $\pi$ in (5.46), where $SO(2) \subset SO(3)$ is the group of rotations
around the $z$-axis. This is the abelian group

\[ T = \{\mathrm{diag}(z, \bar{z}) \mid z \in \mathbb{T}\}. \tag{5.213} \]

This group is isomorphic to $\mathbb{T}$ under $\mathrm{diag}(z, \bar{z}) \mapsto z$ and hence its characters are
given by $\chi_n(\mathrm{diag}(z, \bar{z})) = z^n$, where $n \in \mathbb{Z}$. The identification $\mathfrak{g}^* \cong \mathbb{R}^3$ is made by
identifying $\theta \in \mathfrak{g}^*$ with $(\theta_1, \theta_2, \theta_3)$, where $\theta_i = \theta(S_i)$. Putting $A = S_3$ in (5.211),
see (5.174), therefore gives $r = n/2$ for some $n \in \mathbb{N}$. We conclude that the integral coadjoint
orbits for $SU(2)$ are given by the two-spheres $S_r^2 \subset \mathbb{R}^3$ with $r \in \mathbb{N}_0/2$.

Similarly, for $G = SO(3)$ the stabilizer of $(0, 0, r)$ is $SO(2) \cong \mathbb{T}$ itself, and putting
$A = J_3$ in (5.211) one finds that the integral coadjoint orbits are the spheres $S_r^2$ with $r \in \mathbb{N}_0$.
For any (Lie) group $G$, let the unitary dual $\hat{G}$ be the set whose elements are
equivalence classes of unitary irreducible representations of $G$, where we say:

Definition 5.48. Two unitary representations $u_i : G \to U(H_i)$, $i = 1, 2$, are
equivalent if there is a unitary $v : H_1 \to H_2$ such that $u_2(x) = v u_1(x) v^*$ for each $x \in G$.

The examples $G = \mathbb{T}^d$ and $G = SU(2)$ now suggest the following theorem:

Theorem 5.49. If $G$ is a compact connected Lie group, then the unitary dual $\hat{G}$ is
parametrized by the set of integral coadjoint orbits in $\mathfrak{g}^*$.

Furthermore, there is an explicit (geometric) procedure to construct an irreducible
representation $u_{\mathcal{O}}$ corresponding to such an orbit $\mathcal{O}$, namely by the method of
geometric quantization. We will not explain this method, which would require some
reasonably advanced differential geometry, but instead we outline the connection
between coadjoint orbits and the well-known method of the highest weight.

Let $G$ be a compact connected Lie group and pick a maximal torus $T \subset G$. Let

\[ W_T = N(T)/T \tag{5.214} \]

be the corresponding Weyl group, where $N(T)$ is the normalizer of $T$ in $G$ (i.e.,
$x \in N(T)$ iff $xzx^{-1} \in T$ for each $z \in T$). Note that all maximal tori in compact
connected Lie groups are conjugate, so that the specific choice of $T$ is irrelevant.

For example, for $SU(2)$ we take (5.213), in which case $N(T)$ is generated by $T$
and $\sigma_1 \in SU(2)$, so that $W_T = S_2$, i.e., the permutation group on two variables. In
general the Weyl group inherits the adjoint action of $N(T)$ on $T$, so that $W_T$ acts on
$T$ and hence also acts on $\mathfrak{t}$ and $\mathfrak{t}^*$; for $SU(2)$ the action of the nontrivial element of
$W_T$ (i.e., the image $[\sigma_1]$ of $\sigma_1 \in N(T)$ in $N(T)/T$) on $T$ is given by

\[ [\sigma_1](\mathrm{diag}(z, \bar{z})) = \mathrm{diag}(\bar{z}, z), \tag{5.215} \]

so that its action on $T \cong \mathbb{T}$ is $z \mapsto \bar{z}$, which gives rise to actions $A \mapsto -A$ of $W_T$ on $\mathfrak{t}$
and hence $\lambda \mapsto -\lambda$ of $W_T$ on $\mathfrak{t}^*$. This is a special case of the following bijection:

\[ \mathfrak{g}^*/G \cong \mathfrak{t}^*/W_T, \tag{5.216} \]

where the $G$-action on $\mathfrak{g}^*$ is the coadjoint one; globally, one has $G/\mathrm{Ad}(G) \cong T/W_T$.

Indeed, for $SU(2)$ the left-hand side of (5.216) is the set of spheres $S_r^2$ in $\mathbb{R}^3$,
$r \geq 0$, whereas the right-hand side is $\mathbb{R}/S_2$ (where $S_2$ acts on $\mathbb{R}$ by $\theta \mapsto -\theta$).

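In general, a given coadjoint orbit $\mathcal{O} \subset \mathfrak{g}^*$ defines a Weyl group orbit $\mathcal{O}_W$ in $\mathfrak{t}^*$
as follows: $\mathcal{O}$ contains a point $\theta$ for which $T \subseteq G_\theta$, and we take $\mathcal{O}_W$ to be the orbit
through $\theta|_{\mathfrak{t}}$. Conversely, any $G$-invariant inner product on $\mathfrak{g}$ induces a decomposition

\[ \mathfrak{g} = \mathfrak{t} \oplus \mathfrak{t}^\perp, \tag{5.217} \]

which yields an extension of $\lambda \in \mathfrak{t}^*$ to $\theta_\lambda \in \mathfrak{g}^*$ that vanishes on $\mathfrak{t}^\perp$. Let $\Lambda \subset \mathfrak{t}^*$ be
the set of integral elements in $\mathfrak{t}^*$ (as explained after Definition 5.47). Elements of $\Lambda$
are called weights. Theorem 5.51 below gives a parametrization

\[ \hat{G} \cong \Lambda/W_T, \tag{5.218} \]

which, restricting (5.216) to the integral part $\Lambda \subset \mathfrak{t}^*$, implies Theorem 5.49.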

Instead of working with the quotient $\Lambda/W_T$, one may prefer to work with $\Lambda$ itself, as
follows: we say that $\lambda \in \mathfrak{t}^*$ is regular if $w \cdot \lambda = \lambda$ for $w \in W_T$ iff $w = e$; this is the case
iff $\lambda = \theta|_{\mathfrak{t}}$ with $G_\theta = T$. For $SU(2)$ all weights $\lambda \in \mathbb{Z}$ are regular except $\lambda = 0$.
The set $\mathfrak{t}^*_r$ of regular elements of $\mathfrak{t}^*$ falls apart into connected components $C$, called
Weyl chambers, which are mapped into each other by $W_T$. For $SU(2)$ one has $\mathfrak{t}^*_r =
(-\infty, 0) \cup (0, \infty)$, so that the Weyl chambers are $(-\infty, 0)$ and $(0, \infty)$.

One picks an arbitrary Weyl chamber $C_d$ (for $SU(2)$ this is $(0, \infty)$) and forms

\[ \Lambda_d = \Lambda \cap \overline{C_d}, \tag{5.219} \]

where $\overline{C_d}$ is the closure of $C_d$ in $\mathfrak{t}^*$. Elements of $\Lambda_d$ are called dominant weights.
For each element of $\Lambda/W_T$ there is a unique dominant weight representing it in $\Lambda$,
so that instead of (5.218) we may also write what Theorem 5.51 actually gives, viz.

\[ \hat{G} \cong \Lambda_d. \tag{5.220} \]

To explain this in some detail, we need further preparation. Any (unitary)
representation $u : G \to U(H)$ on some finite-dimensional Hilbert space $H$ restricts to $T$, and
since $T$ is abelian, we may simultaneously diagonalize all operators $u(z)$, $z \in T$. The
operators $iu'(A)$, where $A \in \mathfrak{t}$, commute as well, so that we may decompose

\[ H = \bigoplus_{\mu \in \Lambda_u} H_\mu, \tag{5.221} \]

where $\Lambda_u \subset \Lambda$ contains the weights that occur in $u|_T$, so that for each $\psi \in H_\mu$,

\[ u(z)\psi = \chi_\mu(z)\psi \quad (z \in T); \tag{5.222} \]

\[ iu'(Z)\psi = \mu(Z)\psi \quad (Z \in \mathfrak{t}), \tag{5.223} \]

where the character $\chi_\mu : T \to \mathbb{T}$ corresponding to the weight $\mu \in \Lambda$ is defined as
in (5.212) with $\mu = (n_1, \ldots, n_d)$ and $z = (z_1, \ldots, z_d) \in T \cong \mathbb{T}^d$, where $d = \dim(T)$.
For example, we have seen that the irreducible representation $D_j(SU(2))$ on $H_j \cong
\mathbb{C}^{2j+1}$ contains the weights in $\Lambda_j = \{-j, -j+1, \ldots, j-1, j\}$, where $j \in \mathbb{N}_0/2$.

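In particular, take $H = \mathfrak{g}_{\mathbb{C}}$ with some $G$-invariant inner product, cf. (5.148), and
take $u = \mathrm{Ad}$, given by $\mathrm{Ad}(x)B = xBx^{-1}$, so that $\mathrm{Ad}'(A)(B) = [A, B]$, extended from
$\mathfrak{g}$ to $\mathfrak{g}_{\mathbb{C}}$: we write $\mathfrak{g}_{\mathbb{C}} = \mathfrak{g} + i\mathfrak{g}$ and hence put $\mathrm{Ad}'(A)(B + iC) = [A, B] + i[A, C]$, where
$A, B, C \in \mathfrak{g}$. We assume that the inner product $\langle \cdot, \cdot \rangle$ on $\mathfrak{g}_{\mathbb{C}}$ is obtained from a real
inner product on $\mathfrak{g}$ by complexification. This inner product on $\mathfrak{g}$ may be restricted
to $\mathfrak{t} \subset \mathfrak{g}$ and hence induces an inner product on $\mathfrak{t}^*$, also denoted by $\langle \cdot, \cdot \rangle$. For
example, if $G$ is semi-simple (like $SU(2)$), one may take the inner product on $\mathfrak{g}$ and
hence on $\mathfrak{g}_{\mathbb{C}}$ to be the Cartan-Killing form $\langle A, B \rangle = -\tfrac{1}{2}\mathrm{Tr}\,(\mathrm{Ad}'(A)\mathrm{Ad}'(B))$, which is
nondegenerate because $G$ is semi-simple, and positive definite since $G$ is compact.
For $SU(2)$ or $SO(3)$ this gives the usual inner product on $\mathfrak{g} \cong \mathbb{R}^3$ and $\mathfrak{g}_{\mathbb{C}} \cong \mathbb{C}^3$.

Definition 5.50. The roots of $\mathfrak{g}$ are the nonzero weights of the adjoint representation
$u = \mathrm{Ad}$ on $H = \mathfrak{g}_{\mathbb{C}}$. That is, writing $\Delta \subset \Lambda$ for the set of roots, we have $\alpha \in \Delta$ iff
$\alpha : \mathfrak{t} \to \mathbb{R}$ is not identically zero and there is some $E_\alpha \in \mathfrak{g}_{\mathbb{C}}$ such that for each $Z \in \mathfrak{t}$,

\[ i[Z, E_\alpha] = \alpha(Z) E_\alpha, \tag{5.224} \]

cf. (5.223). Furthermore, subject to the choice of a preferred Weyl chamber $C_d$ in $\mathfrak{t}^*$,
we say $\alpha \in \Delta$ is positive, denoted by $\alpha \in \Delta^+$, if $\langle \alpha, \lambda \rangle > 0$ for each $\lambda \in C_d$.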

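Since $\langle \alpha, \lambda \rangle$ is real and nonzero for each $\alpha \in \Delta$ and $\lambda \in C_d$, one has either $\alpha \in \Delta^+$ or
$-\alpha \in \Delta^+$, i.e., $\alpha \in \Delta^- = -\Delta^+$. Since $\mathfrak{t}$ is maximal abelian in $\mathfrak{g}$, it can also be shown
that each root is nondegenerate. Writing $\mathfrak{g}_\alpha = \mathbb{C} \cdot E_\alpha$, this gives a decomposition

\[ \mathfrak{g}_{\mathbb{C}} = \mathfrak{t}_{\mathbb{C}} \oplus \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_\alpha \oplus \bigoplus_{\alpha \in \Delta^-} \mathfrak{g}_\alpha. \tag{5.225} \]

For $G = SU(2)$, the single generator of $\mathfrak{t}$ is $S_3$, and taking $E_\pm = i(S_1 \pm iS_2)$, we see
from (5.180) that $i[S_3, E_\pm] = \pm E_\pm$. Hence the roots are $\alpha_\pm$, given by $\alpha_\pm(S_3) = \pm 1$,
and with $(0, \infty)$ as the Weyl chamber of choice, the root $\alpha_+$ is the positive one.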

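We now define a partial ordering $\leq$ on $\Lambda$ by putting $\mu \leq \lambda$ iff $\lambda - \mu = \sum_i n_i \alpha_i$
for some $n_i \in \mathbb{N}_0$ and $\alpha_i \in \Delta^+$. This brings us to the theorem of the highest weight:

Theorem 5.51. Let $G$ be a connected compact Lie group. There is a parametrization
$\hat{G} \cong \Lambda_d$, such that any unitary irreducible representation $u_\lambda : G \to U(H_\lambda)$ in the class
$\lambda \in \hat{G}$ defined by a given dominant weight $\lambda \in \Lambda_d$ has the following properties:

1. $H_\lambda$ contains a unit vector $\upsilon_\lambda$, unique up to a phase, such that

\[ iu'_\lambda(Z)\upsilon_\lambda = \lambda(Z)\upsilon_\lambda \quad (Z \in \mathfrak{t}); \tag{5.226} \]

\[ iu'_\lambda(E_\alpha)\upsilon_\lambda = 0 \quad (\alpha \in \Delta^+). \tag{5.227} \]

2. Any other weight $\mu$ occurring in $H_\lambda$, cf. (5.221), satisfies $\mu \leq \lambda$ and $\mu \neq \lambda$.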

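The crucial point is that eqs. (5.226) - (5.227) imply

\[ \theta_\lambda(A) = i\langle \upsilon_\lambda, u'_\lambda(A)\upsilon_\lambda \rangle \quad (A \in \mathfrak{g}), \tag{5.228} \]

where $\theta_\lambda \in \mathfrak{g}^*$ was defined after (5.217) by $\lambda \in \Lambda_d \subset \mathfrak{t}^*$. Since each operator $u_\lambda(x)$
is unitary, each vector $u_\lambda(x)\upsilon_\lambda$ is a unit vector, so we may form the $G$-orbit

\[ \{ |u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|, \ x \in G \} \tag{5.229} \]

through $|\upsilon_\lambda\rangle\langle\upsilon_\lambda|$ in the space $\mathcal{P}_1(H_\lambda)$ of all one-dimensional projections on $H_\lambda$.
Denoting the coadjoint orbit $G \cdot \theta_\lambda \subset \mathfrak{g}^*$ by $\mathcal{O}_\lambda$, where $\lambda = (\theta_\lambda)|_{\mathfrak{t}}$, the map

\[ x \cdot \theta_\lambda \mapsto |u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|, \tag{5.230} \]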

is a $G$-equivariant diffeomorphism (in fact, a symplectomorphism) from $\mathcal{O}_\lambda$ to the orbit (5.229).
This amplifies Theorem 5.49 by making the bijective correspondence between
the set $\Lambda_d$ of dominant weights and the set of integral coadjoint orbits explicit.

5.10 Symmetry groups and projective representations

Despite the power and beauty of unitary group representations in mathematics, in
the context of e.g. Wigner's Theorem we have seen that in physics one should look at
homomorphisms $x \mapsto W(x)$, where $W(x)$ is a symmetry of $\mathcal{P}_1(H)$. In view of
Theorem 5.4, this is equivalent to considering a single homomorphism $h$ from $G$ into the group of
(5.136). To simplify the discussion, we now drop $U_a(H)$ from consideration and just
deal with the connected component $U(H)/\mathbb{T}$ of the identity. This restriction
may be justified by noting that in what follows we will only deal with symmetries
given by connected Lie groups, which have the property that each element is a
product of squares $x = y^2$. In that case, $h(x) = h(y)^2$ is always a square and hence
it cannot lie in the component $U_a(H)/\mathbb{T}$ (the anti-unitary case does play a role as
soon as discrete symmetries are studied, such as time inversion, parity, or charge
conjugation). Thus in what follows we will study continuous homomorphisms

\[ h : G \to U(H)/\mathbb{T}, \tag{5.231} \]

where $U(H)/\mathbb{T}$ has the quotient topology inherited from the strong operator
topology on $U(H)$, as explained above. Since it is inconvenient to deal with such a
quotient, we try to lift $h$ to some map (5.137) where, in terms of the canonical projection

\[ \pi : U(H) \to U(H)/\mathbb{T}, \tag{5.232} \]

which is evidently a group homomorphism, we have

\[ \pi \circ u = h. \tag{5.233} \]

This can be done by choosing a cross-section $s$ of $\pi$, that is, a measurable map

\[ s : U(H)/\mathbb{T} \to U(H), \tag{5.234} \]

or (this doesn't matter much) a map $s : h(G) \to U(H)$, such that

\[ \pi \circ s = \mathrm{id}. \tag{5.235} \]

Given $h$, such a cross-section $s$ yields a map $u : G \to U(H)$ through

\[ u = s \circ h; \tag{5.236} \]

in particular, $\pi(u(x)) = h(x)$. Such a lift often loses the homomorphism property,
though in a controlled way, as follows. Since different choices of $s$ must differ by a
phase, and $h$ is a homomorphism of groups, there must be a function

\[ c : G \times G \to \mathbb{T} \tag{5.237} \]

such that

\[ u(x)u(y) = c(x, y)\,u(xy) \quad (x, y \in G). \tag{5.238} \]

Indeed, since $\pi$ and $h$ are homomorphisms, we may compute

\[ \pi(u(x)u(y)u(xy)^{-1}) = \pi(s(h(x)))\,\pi(s(h(y)))\,\pi(s(h(xy)))^{-1} = h(x)h(y)h(xy)^{-1} = h(xy)h(xy)^{-1} = h(e_G) = e_{U(H)/\mathbb{T}}. \]

Hence $u(x)u(y)u(xy)^{-1} \in \pi^{-1}(e_{U(H)/\mathbb{T}}) = \mathbb{T} \cdot 1_H$, which yields (5.238), or, more
directly,

\[ c(x, y) \cdot 1_H = u(x)u(y)u(xy)^*. \tag{5.239} \]

Associativity of multiplication in $G$ and the homomorphism property of $h$ yield

\[ c(x, y)\,c(xy, z) = c(x, yz)\,c(y, z), \tag{5.240} \]

and if we impose the natural requirement $u(e) = 1_H$, we also have

\[ c(e, x) = c(x, e) = 1. \tag{5.241} \]


Definition 5.52. A function $c : G \times G \to \mathbb{T}$ satisfying (5.240) and (5.241) is called a
multiplier or 2-cocycle on $G$ (in the topological case one requires $c$ to be Borel
measurable, and for Lie groups it should in addition be smooth near the identity).
The set of such multipliers, seen as an abelian group under (pointwise) operations
in $\mathbb{T}$, is denoted by $Z^2(G, \mathbb{T})$. If $c$ takes the form

\[ c(x, y) = \frac{b(xy)}{b(x)b(y)}, \tag{5.242} \]

where $b : G \to \mathbb{T}$ satisfies $b(e) = 1$ (and is measurable and smooth near $e$ as
appropriate), then $c$ is called a 2-coboundary or an exact multiplier. The set of trivial
multipliers forms a (normal) subgroup $B^2(G, \mathbb{T})$ of $Z^2(G, \mathbb{T})$, and the quotient

\[ H^2(G, \mathbb{T}) = \frac{Z^2(G, \mathbb{T})}{B^2(G, \mathbb{T})} \tag{5.243} \]

is called the second cohomology group of $G$ with coefficients in $\mathbb{T}$.

The reason 2-coboundaries and the ensuing group $H^2(G, \mathbb{T})$ are interesting for our
problem is as follows. Given a map $x \mapsto u(x)$ from $G$ to $U(H)$ with (5.238), suppose
we change $u(x)$ to $u'(x) = b(x)u(x)$. The associated multiplier then changes to

\[ c'(x, y) = \frac{b(x)b(y)}{b(xy)}\, c(x, y), \tag{5.244} \]

in that $u'(x)u'(y) = c'(x, y)\,u'(xy)$. In particular, a multiplier of the form (5.242) may be
removed by such a transformation, and is accordingly called exact.
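A toy illustration of (5.240) - (5.244) may help. The sketch below is not taken from the text: it assumes the finite cyclic group $\mathbb{Z}_n$ (written additively) in place of a Lie group $G$, and a pseudo-random phase function $b$ with $b(e) = 1$. It builds the exact multiplier (5.242), checks the multiplier conditions, and confirms that the change of phase (5.244) removes it. All names are illustrative.

```python
import cmath
import random

def random_phases(n, seed=0):
    """A phase function b : Z_n -> T with b(0) = 1 (reproducible via the seed)."""
    rng = random.Random(seed)
    b = [cmath.exp(1j * rng.uniform(0.0, 2.0 * cmath.pi)) for _ in range(n)]
    b[0] = 1.0 + 0.0j
    return b

def exact_multiplier(b):
    """c(x, y) = b(x + y) / (b(x) b(y)), the additive form of (5.242)."""
    n = len(b)
    def c(x, y):
        return b[(x + y) % n] / (b[x] * b[y])
    return c

def is_multiplier(c, n, tol=1e-9):
    """Check the cocycle identity (5.240) and the normalisation (5.241) on Z_n."""
    cocycle = all(
        abs(c(x, y) * c((x + y) % n, z) - c(x, (y + z) % n) * c(y, z)) < tol
        for x in range(n) for y in range(n) for z in range(n)
    )
    normalised = all(abs(c(0, x) - 1) < tol and abs(c(x, 0) - 1) < tol
                     for x in range(n))
    return cocycle and normalised

def transformed(c, b):
    """The multiplier c' of (5.244) obtained after rephasing u(x) to b(x) u(x)."""
    n = len(b)
    def c_prime(x, y):
        return b[x] * b[y] / b[(x + y) % n] * c(x, y)
    return c_prime

def is_trivial(c, n, tol=1e-9):
    return all(abs(c(x, y) - 1) < tol for x in range(n) for y in range(n))
```

For instance, with `b = random_phases(5)` and `c = exact_multiplier(b)`, one expects `is_multiplier(c, 5)` to hold and `transformed(c, b)` to be identically 1.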

Proposition 5.53. If $H^2(G, \mathbb{T})$ is trivial, then any multiplier can be removed by
modifying the lift $u$ of $h$, and the ensuing map $u' : G \to U(H)$ is a homomorphism
and hence a unitary representation of $G$ on $H$. In that case, any homomorphism
$G \to U(H)/\mathbb{T}$ comes from a unitary representation $u : G \to U(H)$ through (5.233).
This is true by construction. By the same token, if $H^2(G, \mathbb{T})$ is non-trivial, then $G$
will have projective representations that cannot be turned into ordinary ones by a
change of phase (for it can be shown that any multiplier $c \in Z^2(G, \mathbb{T})$ is realized by
some projective representation). Thus it is important to compute $H^2(G, \mathbb{T})$ for any
given (physically relevant) group $G$, and see what can be done if it is non-trivial.

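To this end we present the main results of practical use. In order to state one of
the main results (Whitehead's Lemma), we need to set up a cohomology theory for
$\mathfrak{g}$ (which we only need with trivial coefficients). Let $C^k(\mathfrak{g}, \mathbb{R})$ be the abelian group
of all $k$-linear totally antisymmetric maps $\varphi : \mathfrak{g}^k \to \mathbb{R}$, with coboundary maps

\[ \delta^{(k)} : C^k(\mathfrak{g}, \mathbb{R}) \to C^{k+1}(\mathfrak{g}, \mathbb{R}); \tag{5.245} \]

\[ \delta^{(k)}\varphi(X_0, X_1, \ldots, X_k) = \sum_{0 \leq i < j \leq k} (-1)^{i+j} \varphi([X_i, X_j], X_0, \ldots, \hat{X}_i, \ldots, \hat{X}_j, \ldots, X_k), \tag{5.246} \]

where the hat means that the corresponding entry is omitted. For example, we have

\[ \delta^{(1)}\varphi(X_0, X_1) = -\varphi([X_0, X_1]); \]

\[ \delta^{(2)}\varphi(X_0, X_1, X_2) = -\varphi([X_0, X_1], X_2) + \varphi([X_0, X_2], X_1) - \varphi([X_1, X_2], X_0). \]

These maps satisfy "$\delta^2 = 0$", or, more precisely,

\[ \delta^{(k+1)} \circ \delta^{(k)} = 0, \tag{5.247} \]

and hence we may define the following abelian groups:

\[ B^k(\mathfrak{g}, \mathbb{R}) = \mathrm{ran}(\delta^{(k-1)}); \tag{5.248} \]

\[ Z^k(\mathfrak{g}, \mathbb{R}) = \ker(\delta^{(k)}); \tag{5.249} \]

\[ H^k(\mathfrak{g}, \mathbb{R}) = \frac{Z^k(\mathfrak{g}, \mathbb{R})}{B^k(\mathfrak{g}, \mathbb{R})}. \tag{5.250} \]

Note that $B^k(\mathfrak{g}, \mathbb{R}) \subseteq Z^k(\mathfrak{g}, \mathbb{R})$ because of (5.247). In particular, for $k = 2$ the group
$Z^2(\mathfrak{g}, \mathbb{R})$ of all 2-cocycles on $\mathfrak{g}$ consists of all bilinear maps $\varphi : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ that satisfy

\[ \varphi(X, Y) = -\varphi(Y, X); \tag{5.251} \]

\[ \varphi(X, [Y, Z]) + \varphi(Z, [X, Y]) + \varphi(Y, [Z, X]) = 0, \tag{5.252} \]

and its subgroup $B^2(\mathfrak{g}, \mathbb{R})$ of all 2-coboundaries comprises all $\varphi$ taking the form

\[ \varphi(X, Y) = \theta([X, Y]), \quad \theta \in \mathfrak{g}^*. \tag{5.253} \]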

For example, for $\mathfrak{g} = \mathbb{R}$ any antisymmetric bilinear map $\mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is zero, so that

\[ H^2(\mathbb{R}, \mathbb{R}) = 0. \tag{5.254} \]

This has nothing to do with the fact that the Lie bracket on $\mathfrak{g}$ vanishes. Indeed, $\mathfrak{g} = \mathbb{R}^2$
does admit a nontrivial 2-cocycle, unique up to a scalar multiple, given by (half) the symplectic form, i.e.,

\[ \varphi_0((p, q), (p', q')) = \tfrac{1}{2}(pq' - qp'). \tag{5.255} \]

Since $B^2(\mathbb{R}^2, \mathbb{R}) = 0$, this cannot be removed, hence (5.255) generates

\[ H^2(\mathbb{R}^2, \mathbb{R}) \cong \mathbb{R}. \tag{5.256} \]

As far as cohomology is concerned, each Lie group and each Lie algebra has its
own story, although in some cases a group of stories may be collected into a single
narrative. As a case in point, a Lie algebra $\mathfrak{g}$ is called simple when it has no proper
ideals, and semi-simple when it has no commutative ideals. A Lie algebra is
semi-simple iff it is a direct sum of simple Lie algebras. If a Lie group $G$ is (semi-) simple,
then so is its Lie algebra $\mathfrak{g}$. A basic result, often called Whitehead's Lemma, is:

Lemma 5.54. If $\mathfrak{g}$ is semi-simple, then $H^2(\mathfrak{g}, \mathbb{R}) = 0$.

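Proof. The key point is that $C^k(\mathfrak{g}, \mathbb{R})$ is a $\mathfrak{g}$-module under the action

\[ (X_0 \cdot \varphi)(X_1, \ldots, X_k) = -\sum_{i=1}^{k} \varphi(X_1, \ldots, [X_0, X_i], \ldots, X_k). \tag{5.257} \]

For $k = 2$, a simple computation using (5.251) and the examples below (5.246) shows that

\[ (X_0 \cdot \varphi)(X_1, X_2) = -\varphi([X_0, X_1], X_2) - \varphi(X_1, [X_0, X_2]) = \delta^{(2)}\varphi(X_0, X_1, X_2) + \delta^{(1)}(\varphi(X_0, -))(X_1, X_2), \tag{5.258} \]

where at fixed $X_0$, the map $\varphi(X_0, -)$ is seen as an element of $C^1(\mathfrak{g}, \mathbb{R})$. This shows
that $\mathfrak{g}$ maps both $B^2(\mathfrak{g}, \mathbb{R})$ and $Z^2(\mathfrak{g}, \mathbb{R})$ into itself. Indeed, if $\varphi \in B^2(\mathfrak{g}, \mathbb{R})$, then the
first term in (5.258) vanishes because $\delta^{(2)} \circ \delta^{(1)} = 0$, cf. (5.247), so that the right-hand
side of (5.258) takes the form $\delta^{(1)}(\cdots)$ and hence lies in $B^2(\mathfrak{g}, \mathbb{R})$. Similarly,
if $\delta^{(2)}\varphi = 0$, then $\delta^{(2)}(X_0 \cdot \varphi) = 0$. We now use the fact that if $\mathfrak{g}$ is semi-simple,
then any finite-dimensional module is completely reducible. Consequently, as a
$\mathfrak{g}$-module, $Z^2(\mathfrak{g}, \mathbb{R})$ must decompose as $Z^2(\mathfrak{g}, \mathbb{R}) = B^2(\mathfrak{g}, \mathbb{R}) \oplus V$, where $V$ is some
$\mathfrak{g}$-module. Hence if $\varphi \in V$, then $X_0 \cdot \varphi \in V$. Since $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$, the first term in
(5.258) vanishes, whilst the second term lies in $B^2(\mathfrak{g}, \mathbb{R})$. Since $V \cap B^2(\mathfrak{g}, \mathbb{R}) = \{0\}$,
we therefore have $X_0 \cdot \varphi = 0$, and hence $\delta^{(1)}(\varphi(X_0, -))(X_1, X_2) = 0$, which gives
$\varphi(X_0, [X_1, X_2]) = 0$, for all $X_0, X_1, X_2 \in \mathfrak{g}$. At this point we use another implication of
the semi-simplicity of $\mathfrak{g}$, namely $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$. It follows that $\varphi = 0$, whence $V = \{0\}$,
from which $Z^2(\mathfrak{g}, \mathbb{R}) = B^2(\mathfrak{g}, \mathbb{R})$, or, in other words, $H^2(\mathfrak{g}, \mathbb{R}) = 0$. □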
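For small Lie algebras the dimensions of $Z^2(\mathfrak{g}, \mathbb{R})$ and $B^2(\mathfrak{g}, \mathbb{R})$ can be computed directly from structure constants, which gives a concrete check of (5.254), (5.256) and of the lemma for $\mathfrak{so}(3)$. The following sketch rests on assumptions not in the text: a basis $e_0, \ldots, e_{n-1}$ with $[e_i, e_j] = \sum_k f_{ijk} e_k$, the cocycle conditions (5.251) - (5.252) imposed on basis triples, and the coboundary form (5.253). Function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in r] for r in rows]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def cohomology_dims(f):
    """Return (dim Z^2, dim B^2, dim H^2) for structure constants f[i][j][k]."""
    n = len(f)
    pairs = list(combinations(range(n), 2))   # unknowns phi(e_a, e_b) with a < b
    index = {p: i for i, p in enumerate(pairs)}

    def phi_row(a, coeffs):
        """Coefficients of phi(e_a, sum_c coeffs[c] e_c) in the unknowns."""
        row = [Fraction(0)] * len(pairs)
        for c, coeff in enumerate(coeffs):
            if a < c:
                row[index[(a, c)]] += coeff
            elif a > c:
                row[index[(c, a)]] -= coeff   # antisymmetry (5.251)
        return row

    cocycle_rows = []
    for i, j, k in product(range(n), repeat=3):
        # phi(e_i, [e_j, e_k]) + phi(e_k, [e_i, e_j]) + phi(e_j, [e_k, e_i]) = 0
        terms = [phi_row(i, f[j][k]), phi_row(k, f[i][j]), phi_row(j, f[k][i])]
        cocycle_rows.append([sum(t) for t in zip(*terms)])
    dim_z2 = len(pairs) - rank(cocycle_rows)
    # Coboundaries: theta -> (theta([e_a, e_b]))_{a<b}, a linear map g* -> C^2.
    dim_b2 = rank([f[a][b] for a, b in pairs])
    return dim_z2, dim_b2, dim_z2 - dim_b2

def so3():
    """Structure constants of so(3): [e_i, e_j] = epsilon_ijk e_k."""
    f = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        f[i][j][k] = 1
        f[j][i][k] = -1
    return f

def abelian(n):
    """Structure constants of the abelian Lie algebra R^n."""
    return [[[0] * n for _ in range(n)] for _ in range(n)]
```

Under these assumptions `cohomology_dims(so3())` should report a vanishing $H^2$, whereas `cohomology_dims(abelian(2))` should report a one-dimensional $H^2$ spanned by the symplectic form.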

Theorem 5.55. Let $G$ be a connected and simply connected Lie group. Then

\[ H^2(G, \mathbb{T}) \cong H^2(\mathfrak{g}, \mathbb{R}). \tag{5.259} \]

Proof. This is really a conjunction of two isomorphisms:

\[ H^2(G, \mathbb{T}) \cong H^2(G, \mathbb{R}); \tag{5.260} \]

\[ H^2(G, \mathbb{R}) \cong H^2(\mathfrak{g}, \mathbb{R}), \tag{5.261} \]

where $\mathbb{R}$ is the usual additive group, and $Z^2(G, \mathbb{R})$, $B^2(G, \mathbb{R})$, and hence $H^2(G, \mathbb{R})$
are defined analogously to $Z^2(G, \mathbb{T})$ etc. The first isomorphism is simply induced by

\[ Z^2(G, \mathbb{R}) \to Z^2(G, \mathbb{T}); \tag{5.262} \]

\[ \Gamma(x, y) \mapsto e^{i\Gamma(x, y)} \equiv c(x, y), \tag{5.263} \]

which preserves exactness and induces an isomorphism in cohomology (but note
that (5.262) - (5.263) may not itself define an isomorphism).

The isomorphism (5.261) is induced at the cochain level, too. Given a cocycle
$\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$, we construct a new Lie algebra $\mathfrak{g}_\varphi$ (called a central extension of $\mathfrak{g}$)
by taking $\mathfrak{g}_\varphi = \mathfrak{g} \oplus \mathbb{R}$ as a vector space, equipped though with the unusual bracket

\[ [(X, v), (Y, w)] = ([X, Y], \varphi(X, Y)); \tag{5.264} \]

the condition $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$ guarantees that this is a Lie bracket. Furthermore, $\mathfrak{g}_\varphi$
is isomorphic (as a Lie algebra) to a direct sum iff $\varphi \in B^2(\mathfrak{g}, \mathbb{R})$; indeed, if (5.253)
holds, then $(X, v) \mapsto (X, v - \theta(X))$ yields the desired isomorphism $\mathfrak{g}_\varphi \to \mathfrak{g} \oplus \mathbb{R}$.

By Lie's Third Theorem, there is a connected and simply connected Lie group
$G_\varphi$ (again called a central extension of $G$), with Lie algebra $\mathfrak{g}_\varphi$. As a manifold,
$G_\varphi = G \times \mathbb{R}$, but the group laws are given, in terms of a function $\Gamma : G \times G \to \mathbb{R}$, by

\[ (x, v) \cdot (y, w) = (xy, v + w + \Gamma(x, y)); \tag{5.265} \]

\[ (x, v)^{-1} = (x^{-1}, -v - \Gamma(x, x^{-1})). \tag{5.266} \]

The group axioms then imply (indeed, they are equivalent to) the condition $\Gamma \in
Z^2(G, \mathbb{R})$. Furthermore, two such extensions $G_\varphi$ and $G'_\varphi$ are isomorphic iff the
corresponding cocycles $\Gamma$ and $\Gamma'$ are related by the additive analogue of (5.244), and in particular, $\Gamma \in B^2(G, \mathbb{R})$
iff $G_\varphi$ is isomorphic (as a Lie group) to a direct product $G \times \mathbb{R}$, which in turn is the
case iff $\varphi \in B^2(\mathfrak{g}, \mathbb{R})$. Conversely, given $\Gamma \in Z^2(G, \mathbb{R})$, we define the central
extension $G_\varphi$ by (5.265) - (5.266), to find that the associated Lie algebra $\mathfrak{g}_\varphi$ takes the
above form, defining $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$ through (5.264). Explicitly,

= ijt (5.267) 

Lie's Third Theorem thus implies that the map $\varphi \mapsto \Gamma$ (which is not necessarily a
bijection) descends to an isomorphism $H^2(\mathfrak{g}, \mathbb{R}) \to H^2(G, \mathbb{R})$ in cohomology. □

Given (5.254), Theorem 5.55 immediately gives

\[ H^2(\mathbb{R}, \mathbb{T}) = 0. \tag{5.268} \]

In particular, if $\mathbb{R}$ is the relevant symmetry group, which is the case e.g. with time
translation, by Proposition 5.53 we may restrict ourselves to unitary representations.
Once again, this has nothing to do with abelianness or topological triviality of $\mathbb{R}$.
Indeed, for $G = \mathfrak{g} = \mathbb{R}^2$ the Heisenberg cocycle (5.255) comes from the multiplier

\[ c_0((p, q), (p', q')) = e^{i(pq' - qp')/2}, \tag{5.269} \]

where $\mathbb{R}^2$ is seen as the group of translations in the phase space $\mathbb{R}^2$ of a particle
moving on $\mathbb{R}$. Accordingly, this multiplier is realized by the following projective
representation of $\mathbb{R}^2$ on $L^2(\mathbb{R})$:

\[ u(p, q)\psi(x) = e^{-ipq/2} e^{ipx} \psi(x - q). \tag{5.270} \]

If $\mathbb{R}^2$ is the configuration space of some particle, and the group $\mathbb{R}^2$ produces
translations in the latter (i.e., of position), then the appropriate unitary representation
would rather be on $L^2(\mathbb{R}^2)$ and would have trivial multiplier, viz.

\[ u(q_1, q_2)\psi(x_1, x_2) = \psi(x_1 - q_1, x_2 - q_2). \tag{5.271} \]

Similarly, $G = \mathbb{R}^2$, now seen as generating translations of momentum in the phase
space $\mathbb{R}^4$ of the latter example, would appropriately be represented on $L^2(\mathbb{R}^2)$ as

\[ u(p_1, p_2)\psi(x_1, x_2) = e^{i(p_1 x_1 + p_2 x_2)} \psi(x_1, x_2). \tag{5.272} \]
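The claim that the projective representation realizes the Heisenberg multiplier can be spot-checked numerically. The sketch below assumes that (5.269) reads $c_0 = e^{i(pq' - qp')/2}$ and that (5.270) reads $u(p, q)\psi(x) = e^{-ipq/2} e^{ipx} \psi(x - q)$; the test function `psi0` and the sample points are arbitrary choices made for illustration, and all names are hypothetical.

```python
import cmath

def c0(a, b):
    """Assumed form of the multiplier (5.269) for a = (p, q), b = (p', q')."""
    (p, q), (p2, q2) = a, b
    return cmath.exp(0.5j * (p * q2 - q * p2))

def u(a, psi):
    """Assumed form of (5.270): returns the function u(p, q) psi."""
    p, q = a
    def transformed(x):
        return cmath.exp(-0.5j * p * q) * cmath.exp(1j * p * x) * psi(x - q)
    return transformed

def psi0(x):
    """An arbitrary smooth, rapidly decaying test function."""
    return (1.0 + 0.3 * x) * cmath.exp(-0.5 * x * x)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def realizes_multiplier(a, b, points=(-1.3, 0.0, 0.4, 2.1), tol=1e-9):
    """Check u(a) u(b) psi = c0(a, b) u(a + b) psi, cf. (5.238), at sample points."""
    lhs = u(a, u(b, psi0))
    rhs = u(add(a, b), psi0)
    return all(abs(lhs(x) - c0(a, b) * rhs(x)) < tol for x in points)

def satisfies_cocycle_identity(a, b, c, tol=1e-9):
    """Check (5.240) for c0 on three group elements."""
    return abs(c0(a, b) * c0(add(a, b), c) - c0(a, add(b, c)) * c0(b, c)) < tol
```

A pointwise check on finitely many sample points is of course no proof, but it catches sign and factor-of-two slips in the phase conventions.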

Corollary 5.56. Let $G$ be a connected and simply connected semi-simple Lie group.
Then $H^2(G, \mathbb{T})$ is trivial.

Here we say that a Lie group is simple when it has no proper connected normal
subgroups, and semi-simple if it has no proper connected normal abelian subgroups.
For example, the "classical Lie groups" of Weyl are semi-simple, including $SO(3)$
and $SU(2)$, which are even simple (note that the latter does have a discrete normal
subgroup, namely its center $\{\pm 1_2\} \cong \mathbb{Z}_2$). Also, products of simple Lie groups
are semi-simple. However, Corollary 5.56 does not apply to $SO(3)$, which is
semi-simple but not simply connected. Here the relevant general result is:

Theorem 5.57. Let $G$ be a connected Lie group with $H^2(\mathfrak{g}, \mathbb{R}) = 0$. Then

\[ H^2(G, \mathbb{T}) \cong \widehat{\pi_1(G)}. \tag{5.273} \]

We need some background (cf. §C.15). For any abelian (topological) group $A$, the
set

\[ \hat{A} = \mathrm{Hom}(A, \mathbb{T}) \tag{5.274} \]

consists of all (continuous) homomorphisms (also called characters) $\chi : A \to \mathbb{T}$;
these are just the irreducible (and hence necessarily one-dimensional) unitary
representations of $A$. This set is a group under the obvious pointwise operations

\[ \chi_1\chi_2(a) = \chi_1(a)\chi_2(a); \tag{5.275} \]

\[ \chi^{-1}(a) = \chi(a)^{-1}. \tag{5.276} \]



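As such, the group $\hat{A}$ is called the (Pontryagin) dual of $A$; the Pontryagin Duality
Theorem states that $\hat{\hat{A}} \cong A$. Using Theorem 5.57 and Theorem 5.41, this gives

\[ H^2(SO(3), \mathbb{T}) = \mathbb{Z}_2. \tag{5.277} \]

We now use Theorem 5.41 as a lemma to prove Theorem 5.57:

Proof. We first state the map $\widehat{\pi_1(G)} \to H^2(G, \mathbb{T})$ that will turn out to be an
isomorphism. Assuming Theorem 5.41, pick a (Borel measurable) cross-section

\[ \tilde{s} : G \to \tilde{G} \tag{5.278} \]

of the canonical projection

\[ \tilde{\pi} : \tilde{G} \to G = \tilde{G}/D. \tag{5.279} \]

As always, this means that $\tilde{\pi} \circ \tilde{s} = \mathrm{id}_G$, and $\tilde{s}$ is supposed to be smooth near the
identity, and chosen such that $\tilde{s}(e_G) = e_{\tilde{G}}$, where $e_G$ and $e_{\tilde{G}}$ are the unit elements of
$G$ and $\tilde{G}$, respectively. Given a character $\chi \in \widehat{\pi_1(G)}$, define $c_\chi : G \times G \to \mathbb{T}$ by

\[ c_\chi(x, y) = \chi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}). \tag{5.280} \]

This makes sense: $\tilde{\pi}$ is a homomorphism, so that (cf. the computation below (5.238))

\[ \tilde{\pi}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = \tilde{\pi}(\tilde{s}(x))\,\tilde{\pi}(\tilde{s}(y))\,\tilde{\pi}(\tilde{s}(xy))^{-1} = xy(xy)^{-1} = e_G, \]

and hence $\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1} \in \ker(\tilde{\pi}) = D$ (where we identify $D$ with $\pi_1(G)$, cf.
Theorem 5.41). Furthermore, tedious computations show that (5.240) and (5.241)
hold, so that $c_\chi \in Z^2(G, \mathbb{T})$. Different choices of $\tilde{s}$ lead to equivalent 2-cocycles $c_\chi$,
and hence by taking the cohomology class $[c_\chi]$ of $c_\chi$ we obtain an injective map

\[ \widehat{\pi_1(G)} \to H^2(G, \mathbb{T}); \tag{5.281} \]

\[ \chi \mapsto [c_\chi]. \tag{5.282} \]

To prove surjectivity of this map, let $c \in Z^2(G, \mathbb{T})$ and define $\tilde{c} : \tilde{G} \times \tilde{G} \to \mathbb{T}$ by

\[ \tilde{c}(\tilde{x}, \tilde{y}) = c(\tilde{\pi}(\tilde{x}), \tilde{\pi}(\tilde{y})). \tag{5.283} \]

Conversely, we may recover $c$ from $\tilde{c}$ and some cross-section $\tilde{s} : G \to \tilde{G}$ of $\tilde{\pi}$ by

\[ c(x, y) = \tilde{c}(\tilde{s}(x), \tilde{s}(y)). \tag{5.284} \]

It follows that $\tilde{c} \in Z^2(\tilde{G}, \mathbb{T})$. Theorem 5.55 implies that $H^2(\tilde{G}, \mathbb{T})$ is trivial, so that

\[ \tilde{c}(\tilde{x}, \tilde{y}) = \frac{\tilde{b}(\tilde{x}\tilde{y})}{\tilde{b}(\tilde{x})\tilde{b}(\tilde{y})}, \tag{5.285} \]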

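for some function $\tilde{b} : \tilde{G} \to \mathbb{T}$ satisfying $\tilde{b}(e) = 1$. From (5.241), i.e., $c(e, x) = 1$, we
infer that if $\tilde{x} = \delta \in D$, so that $\tilde{\pi}(\delta) = e$, then $\tilde{c}(\delta, \tilde{y}) = 1$, and hence

\[ \tilde{b}(\delta\tilde{y}) = \tilde{b}(\delta)\tilde{b}(\tilde{y}). \tag{5.286} \]

Taking $\tilde{x}$ and $\tilde{y}$ both in $D$, we see that $\tilde{b}|_D$ is a character, which we call $\chi$. Hence

\[ c(x, y) = \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} = \frac{\tilde{b}(\tilde{s}(xy))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} \cdot \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))} = \frac{\tilde{b}(\tilde{s}(xy))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} \cdot c_\chi(x, y), \tag{5.287} \]

since, using (5.286) with $\delta \rightsquigarrow \tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}$ and $\tilde{y} \rightsquigarrow \tilde{s}(xy)$, we have

\[ \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))} = \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}\tilde{s}(xy))}{\tilde{b}(\tilde{s}(xy))} = \tilde{b}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = c_\chi(x, y). \]

Thus $[c] = [c_\chi]$, and hence the map (5.281) - (5.282) is surjective. □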


Definition 5.58. In the situation and notation of Theorem 5.41, a unitary
representation $\tilde{u} : \tilde{G} \to U(H)$ is called admissible if $\tilde{u}(D) \subset \mathbb{T} \cdot 1_H$.

In that case, there is obviously a character $\chi \in \hat{D}$ such that for each $\delta \in D$ we have

\[ \tilde{u}(\delta) = \chi(\delta) \cdot 1_H. \tag{5.288} \]

Unitary irreducible representations are admissible, since Schur's Lemma implies
that, since $D$ lies in the center of $\tilde{G}$, its image $\tilde{u}(D)$ consists of multiples of the unit.
If $\tilde{u}$ is admissible, we obtain a homomorphism (5.231) by means of

\[ h = \pi \circ \tilde{u} \circ \tilde{s}, \tag{5.289} \]

where $\tilde{s}$ is any cross-section of $\tilde{\pi}$, cf. (5.278) - (5.279). Note that different choices
$\tilde{s}$, $\tilde{s}'$ are related by $\tilde{s}'(x) = \tilde{s}(x)\delta(x)$, where $\delta : G \to D$ is some function, so that

\[ h'(x) = \pi(\tilde{u}(\tilde{s}'(x))) = \pi(\tilde{u}(\tilde{s}(x))\tilde{u}(\delta(x))) = \pi(\tilde{u}(\tilde{s}(x)))\,\pi(\chi(\delta(x)) \cdot 1_H) = h(x). \]

Theorem 5.59. 1. If $G$ is a connected Lie group with $H^2(\mathfrak{g}, \mathbb{R}) = 0$, any
homomorphism $h : G \to U(H)/\mathbb{T}$ as in (5.231) comes from some admissible unitary
representation $\tilde{u}$ of $\tilde{G}$ by (5.289). If $H$ is separable, then $h$ is continuous iff $\tilde{u}$ is.

2. Moreover, if $\tilde{u}(\tilde{G})$ is super-admissible in that $\tilde{u}(\delta) = 1_H$ for each $\delta \in D$, then
$u = \tilde{u} \circ \tilde{s}$ is a unitary representation of $G$, in which case $h = \pi \circ u$ therefore comes
from a unitary representation of $G$ itself.

Proof. Given such a homomorphism $h$, pick a cross-section $s: U(H)/\mathbb{T} \to U(H)$, as in (5.234), with associated 2-cocycle $c$ on $G$ given by (5.239). By Theorem 5.57 and its proof, we may assume (possibly after redefining $s$) that there exists a character $\chi \in \hat{D}$ and a cross-section (5.278) such that $c = c_\chi$, cf. (5.280). We then define

$$\tilde{u}: \tilde{G} \to B(H); \tag{5.290}$$
$$\tilde{x} \mapsto \chi\bigl(\tilde{x}\cdot(\tilde{s}\circ\tilde{\pi}(\tilde{x}))^{-1}\bigr)\,u(\tilde{\pi}(\tilde{x})). \tag{5.291}$$

Simple computations then show that $\tilde{x}\cdot(\tilde{s}\circ\tilde{\pi}(\tilde{x}))^{-1}$ lies in $D$ (and hence in the center of $\tilde{G}$), that (5.288) holds, that each operator $\tilde{u}(\tilde{x})$ is unitary, that the group homomorphism properties $\tilde{u}(\tilde{x})\tilde{u}(\tilde{y}) = \tilde{u}(\tilde{x}\tilde{y})$ and $\tilde{u}(\tilde{e}) = 1_H$ hold, and that (5.289) is valid. As to the last equation, since $\pi$ removes the term with $\chi$ in (5.291), and $u = s\circ h$, we have

$$\pi\circ\tilde{u}\circ\tilde{s}(x) = \pi\circ s\circ h\circ\tilde{\pi}\circ\tilde{s}(x) = h(x),$$

since $\pi\circ s = \mathrm{id}$ (on $U(H)/\mathbb{T}$) and $\tilde{\pi}\circ\tilde{s} = \mathrm{id}$ (on $G$).

If $\tilde{u}(\delta) = 1_H$ for each $\delta \in D$, then $c_\chi = 1$ from (5.280), so that $u(x)u(y) = u(xy)$ by (5.238). If $s$ preserves units, or, equivalently, if $u_e = 1_H$, as we always assume, we see that $u$ is a unitary representation of $G$. In this case, (5.291) simply reads $\tilde{u} = s\circ h\circ\tilde{\pi}$. This immediately yields $\tilde{u} = u\circ\tilde{\pi}$, which in turn gives $u = \tilde{u}\circ\tilde{s}$.

Finally, even if $h$ is continuous, it is a priori unclear if $\tilde{u}$ is, since the cross-sections $s$ and $\tilde{s}$ appearing in the above construction typically fail to be continuous. Fortunately, since they are assumed measurable, there is no question about measurability of $\tilde{u}$, and if $H$ is separable, continuity follows from Proposition 5.36. □

Corollary 5.60. If $G$ is a connected Lie group with covering group $\tilde{G}$, the formulae

$$\tilde{u} = u\circ\tilde{\pi}; \tag{5.292}$$
$$u = \tilde{u}\circ\tilde{s}, \tag{5.293}$$

where $\tilde{s}: G \to \tilde{G}$ is any cross-section of the covering map $\tilde{\pi}: \tilde{G} \to G$, give a bijective correspondence between (continuous) super-admissible unitary representations $\tilde{u}$ of $\tilde{G}$ and (continuous) unitary representations $u$ of $G$, preserving irreducibility.

Corollary 5.61. Any homomorphism $h: SO(3) \to U(H)/\mathbb{T}$ as in (5.231) comes from an admissible unitary representation $\tilde{u}$ of $SU(2)$ by (5.289). Moreover, $h$ comes from a unitary representation $u = \tilde{u}\circ\tilde{s}$ of $SO(3)$ itself iff $\tilde{u}$ is trivial on the center $\mathbb{Z}_2$.

In particular, if $h$ is irreducible, it must come from one of the unitary irreducible representations $\tilde{u} = D_j$, where $j = 0, \tfrac{1}{2}, 1, \ldots$ is the (half-)integer spin label. Then $D_j(SU(2))$ is super-admissible iff $j$ is integral, in which case it defines a unitary irreducible representation of $SO(3)$.
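
As a small numerical illustration of the last statement (a sketch, not part of the original argument), one may compute the image of the central element $-1 \in SU(2)$ under $D_j$. The code assumes the standard weight basis in which $J_z = \mathrm{diag}(j, j-1, \ldots, -j)$, and uses $-1 = \exp(-2\pi i\,\sigma_z/2)$, so that $D_j(-1) = \exp(-2\pi i J_z)$. The function names `center_image` and `is_super_admissible` are hypothetical.

```python
import numpy as np

def center_image(two_j: int) -> np.ndarray:
    """Return D_j(-1) = exp(-2*pi*i*J_z) for spin j = two_j / 2.

    Assumes the weight basis in which J_z is diagonal with
    entries j, j-1, ..., -j (an illustrative convention).
    """
    spin = two_j / 2
    weights = spin - np.arange(two_j + 1)      # j, j-1, ..., -j
    return np.diag(np.exp(-2j * np.pi * weights))

def is_super_admissible(two_j: int) -> bool:
    """True iff D_j maps the center {1, -1} of SU(2) to the identity."""
    dim = two_j + 1
    return bool(np.allclose(center_image(two_j), np.eye(dim)))

for two_j in range(5):
    print(f"j = {two_j}/2: super-admissible = {is_super_admissible(two_j)}")
```

For half-integral $j$ the matrix returned by `center_image` is $-1_H$, which is admissible in the sense of Definition 5.58 but not super-admissible.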

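Indeed, the assumption $H^2(\mathfrak{g},\mathbb{R}) = 0$ in Theorem 5.59 is satisfied for $SO(3)$ because of Whitehead's Lemma 5.54. The case where $H^2(\mathfrak{g},\mathbb{R}) \neq 0$ occurs e.g. for the Galilei group (cf. §7.6). It can be shown that $H^2(\mathfrak{g},\mathbb{R})$ has finitely many generators, for which one finds pre-images $(\varphi_1,\ldots,\varphi_M)$ in $Z^2(\mathfrak{g},\mathbb{R})$, with corresponding elements $(\Gamma_1,\ldots,\Gamma_M)$ of $Z^2(\tilde{G},\mathbb{R})$, cf. the proof of Theorem 5.55. Of these, a subset $(\Gamma_1,\ldots,\Gamma_N)$, $N \leq M$, satisfies the relation $\Gamma_i(\delta,\tilde{x}) = \Gamma_i(\tilde{x},\delta)$ for any $\delta \in D$ (cf. Theorem 5.41) and $\tilde{x} \in \tilde{G}$. This yields a map $\Gamma: \tilde{G}\times\tilde{G} \to \mathbb{R}^N$ given by $\Gamma(\tilde{x},\tilde{y}) = (\Gamma_1(\tilde{x},\tilde{y}),\ldots,\Gamma_N(\tilde{x},\tilde{y}))$, which in turn equips the set

$$\check{G} = \tilde{G}\times\mathbb{R}^N \tag{5.294}$$

with a group multiplication $(\tilde{x},v)\cdot(\tilde{y},w) = (\tilde{x}\tilde{y}, v + w + \Gamma(\tilde{x},\tilde{y}))$. We then have the following generalization of Theorem 5.59, in which a unitary representation $\check{u}$ of $\check{G}$ is called admissible if $\check{u}(\delta,v) \in \mathbb{T}\cdot 1_H$ for any $\delta \in D$ and $v \in \mathbb{R}^N$.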

Theorem 5.62. Let $G$ be a connected Lie group, and $H$ a separable Hilbert space. Then any continuous homomorphism $h: G \to U(H)/\mathbb{T}$ comes from some admissible continuous unitary representation $\check{u}$ of $\check{G}$.

As we only apply this to the Galilei group (where $N = 1$), basically only for illustrative purposes, we omit the proof. The correct (and natural) notion of equivalence of projective representations is as follows: we say that two such homomorphisms $h_i: G \to U(H_i)/\mathbb{T}$, $i = 1,2$, are equivalent if there is a unitary $w: H_1 \to H_2$ such that

$$\mathrm{Ad}_w(h_1(x)) = h_2(x), \quad x \in G, \tag{5.295}$$

where $\mathrm{Ad}_w: U(H_1)/\mathbb{T} \to U(H_2)/\mathbb{T}$ is the map $[u] \mapsto [wuw^*]$, which is well defined (here $[u]$ is the equivalence class of $u \in U(H)$ in $U(H)/\mathbb{T}$ under $u \sim zu$, $z \in \mathbb{T}$).

This induces the following notion for $\check{G}$: two admissible unitary representations $\check{u}_1$, $\check{u}_2$ of $\check{G}$ on Hilbert spaces $H_1$, $H_2$ are equivalent if there is a unitary $w: H_1 \to H_2$ and a map $b: \check{G} \to \mathbb{T}$ such that $w\check{u}_1(\check{x})w^* = b(\check{x})\check{u}_2(\check{x})$, for any $\check{x} \in \check{G}$. It can be shown that such a map $b$ always comes from a character $\chi: \tilde{G} \to \mathbb{T}$ through $b(\tilde{x},v) = \chi(\tilde{x})$.

To close this long and difficult section, it should be mentioned, by way of relief, that the above theory vastly simplifies if $H$ is finite-dimensional. By Theorem 5.40, this is true, for example, if $G$ is compact and $u$ is irreducible. Suppose $u: G \to U(H)$ is merely a projective unitary representation of $G$, so that instead of (5.157) one has

$$[u'(X),u'(Y)] = u'([X,Y]) + i\varphi(X,Y)\cdot 1_H, \tag{5.296}$$

where $\varphi$ is given by (5.267). Since the trace of a commutator vanishes, taking the trace yields

$$\varphi(X,Y) = \frac{i}{n}\,\mathrm{Tr}(u'([X,Y])), \tag{5.297}$$

where $n = \dim(H) < \infty$. We may define a linear function $\theta: \mathfrak{g} \to \mathbb{R}$ by

$$\theta(X) = \frac{i}{n}\,\mathrm{Tr}(u'(X)), \tag{5.298}$$

so that $\varphi(X,Y) = \theta([X,Y])$, cf. (5.253), and hence we may remove $\varphi$ by redefining

$$\tilde{u}'(X) = u'(X) + i\theta(X)\cdot 1_H, \tag{5.299}$$

which satisfies (5.157) - (5.158). Hence by Corollary 5.43 the map $\tilde{u}'$ exponentiates to a unitary representation $\tilde{u}$ of the universal covering group $\tilde{G}$ of $G$; it should be checked from the values of $\tilde{u}$ on $D$ if $\tilde{u}$ also defines a unitary representation of $G$. This argument shows that finite-dimensional projective unitary representations of Lie groups always come from unitary representations of the covering group.
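
The trace argument can be checked numerically in a toy case. The following sketch (an illustration under stated assumptions, not the book's construction) takes the defining representation of $\mathfrak{su}(2)$ with basis $X_k = -i\sigma_k/2$, so that $[X_1,X_2] = X_3$, and adds hypothetical central shifts $i a_k\cdot 1_H$ to mimic a projective representation as in (5.296). It then recovers $\theta$ from the trace and verifies that the redefinition (5.299) restores the commutation relations.

```python
import numpy as np

# Pauli matrices; X_k = -i*sigma_k/2 is a skew-adjoint basis of su(2)
# with [X_1, X_2] = X_3 (and cyclic permutations).
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
a = np.array([0.3, -1.2, 0.7])   # hypothetical real central shifts
n = 2
one = np.eye(n, dtype=complex)

# "Projective" generators u'(X_k): skew-adjoint, but with a central term.
u_prime = [-0.5j * s + 1j * a_k * one for s, a_k in zip(sigma, a)]

def comm(x, y):
    """Matrix commutator [x, y]."""
    return x @ y - y @ x

def theta(m):
    """theta(X) = (i/n) Tr(u'(X)), cf. (5.298); real for skew-adjoint m."""
    return ((1j / n) * np.trace(m)).real

# Central term phi(X_1, X_2), read off from (5.296) using [X_1, X_2] = X_3.
central = comm(u_prime[0], u_prime[1]) - u_prime[2]
phi_12 = (central[0, 0] / 1j).real

# The redefinition (5.299) removes the central term.
u_tilde = [m + 1j * theta(m) * one for m in u_prime]

print("phi(X1,X2) =", phi_12, " theta(X3) =", theta(u_prime[2]))
print("relations restored:",
      np.allclose(comm(u_tilde[0], u_tilde[1]), u_tilde[2]))
```

Here $\varphi(X_1,X_2) = \theta(X_3) = -a_3$, in agreement with $\varphi(X,Y) = \theta([X,Y])$.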
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5.11 Position, momentum, and free Hamiltonian 

The three basic operators of non-relativistic quantum mechanics are position, denoted $q$, momentum, $p$, and the free Hamiltonian $h_0$. Assuming for simplicity that the particle moves in one dimension, these are informally given on $H = L^2(\mathbb{R})$ by

$$q\psi(x) = x\psi(x); \tag{5.300}$$
$$p\psi(x) = -i\hbar\frac{d}{dx}\psi(x); \tag{5.301}$$
$$h_0\psi(x) = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\psi(x), \tag{5.302}$$

where $m$ is the mass of the particle under consideration. We put $\hbar = 1$ and $m = 1/2$.

The issue is that these operators are unbounded; see §B.13. In general, quantum-mechanical observables are supposed to be represented by self-adjoint operators, and examples like (5.300) - (5.302) show that these may not be bounded. The Hellinger-Toeplitz Theorem B.68 then shows that it makes no sense to try and extend the above expressions to all of $L^2(\mathbb{R})$, so we have to live with the fact that some crucial operators $a: D(a) \to H$ are merely defined on a dense subspace $D(a) \subset H$.

Each such operator has an adjoint $a^*: D(a^*) \to H$, whose domain $D(a^*) \subset H$ consists of all $\psi \in H$ for which the functional $\varphi \mapsto \langle\psi,a\varphi\rangle$ is bounded on $D(a)$, and hence (since $D(a)$ is dense in $H$) can be extended to all of $H$ by continuity through the unique "Riesz-Fréchet vector" $\chi$ for which $\langle\psi,a\varphi\rangle = \langle\chi,\varphi\rangle$. Writing $\chi = a^*\psi$, for each $\psi \in D(a^*)$ and $\varphi \in D(a)$ we therefore have

$$\langle a^*\psi,\varphi\rangle = \langle\psi,a\varphi\rangle. \tag{5.303}$$

Assuming that $D(a)$ is dense in $H$, we say that $a$ is self-adjoint, written $a^* = a$, if

$$\langle a\varphi,\psi\rangle = \langle\varphi,a\psi\rangle, \tag{5.304}$$

for each $\psi,\varphi \in D(a)$ and $D(a^*) = D(a)$. A self-adjoint operator $a$ is automatically closed, in that its graph $G(a) = \{(\psi,a\psi) \mid \psi \in D(a)\}$ is a closed subspace of the Hilbert space $H\oplus H$ (indeed, the adjoint of any densely defined operator is closed, see Proposition B.72). In practice, self-adjoint operators often arise as closures of essentially self-adjoint operators $a$, which by definition satisfy $a^{**} = a^*$. Equivalently, such an operator is closable, in that the closure of its graph is the graph of some (uniquely defined) operator, called the closure $a^-$ of $a$, and furthermore this closure is self-adjoint, so that $a^- = a^*$. If $a$ is closable, the domain $D(a^-)$ of its closure consists of all $\psi \in H$ for which there exists a sequence $(\psi_n)$ in $D(a)$ such that $\psi_n \to \psi$ and $a\psi_n$ converges, on which we define $a^-$ by $a^-\psi = \lim_n a\psi_n$.

The simplest case is the position operator. 

Theorem 5.63. The operator $q$ is self-adjoint on the domain

$$D(q) = \Bigl\{\psi \in L^2(\mathbb{R}) \;\Big|\; \int_{\mathbb{R}} dx\, x^2|\psi(x)|^2 < \infty\Bigr\}. \tag{5.305}$$

See Proposition B.73 for the proof. To give a convenient domain of essential self-adjointness (also for the other two operators), we need a little distribution theory.

Definition 5.64. The Schwartz space $\mathcal{S}(\mathbb{R})$ (whose elements are functions of rapid decrease) consists of all smooth functions $f: \mathbb{R} \to \mathbb{C}$ for which each expression

$$\|f\|_{n,m} = \sup\{|x^n f^{(m)}(x)|,\ x \in \mathbb{R}\}, \tag{5.306}$$

where $f^{(m)}$ is the $m$'th derivative of $f$, is finite. The topology of $\mathcal{S}(\mathbb{R})$ is given by saying that a sequence (or net) $f_\lambda$ converges to $f$ iff $\|f_\lambda - f\|_{n,m} \to 0$ for all $n,m \in \mathbb{N}$.

Each $\|\cdot\|_{n,m}$ happens to be a norm, but positive definiteness is nowhere used in the theory below (which therefore works for families of seminorms, which satisfy the axioms of a norm except perhaps for positive definiteness). Since there are countably many such (semi)norms defining the topology, we may equivalently say that $\mathcal{S}(\mathbb{R})$ is a metric space defined by

$$d(f,g) = \sum_{n,m=0}^{\infty} 2^{-n-m}\,\frac{\|f-g\|_{n,m}}{1+\|f-g\|_{n,m}}. \tag{5.307}$$

Indeed, $\mathcal{S}(\mathbb{R})$ is complete in this metric. A typical element is $f(x) = \exp(-x^2)$.
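
To make the seminorms (5.306) concrete, the following sketch (an illustration only, with hypothetical function names) approximates $\|f\|_{n,0}$ for the typical element $f(x) = \exp(-x^2)$ on a finite grid and compares it with the closed form $(n/2)^{n/2}e^{-n/2}$, which follows from maximizing $x^n e^{-x^2}$ at $x^2 = n/2$. The grid bounds and resolution are assumptions chosen so that the supremum is well resolved.

```python
import numpy as np

def seminorm_n0(n: int, x: np.ndarray) -> float:
    """Grid approximation of ||f||_{n,0} = sup |x^n f(x)| for f(x) = exp(-x^2)."""
    return float(np.max(np.abs(x**n * np.exp(-x**2))))

def exact_n0(n: int) -> float:
    """Closed form (n/2)^(n/2) * exp(-n/2), attained at x^2 = n/2."""
    if n == 0:
        return 1.0
    return float((n / 2) ** (n / 2) * np.exp(-n / 2))

x = np.linspace(-10.0, 10.0, 200001)   # grid spacing 1e-4
for n in range(6):
    print(n, seminorm_n0(n, x), exact_n0(n))
```

All values are finite, as rapid decrease requires. A function such as $1/(1+x^2)$ would fail this test for $n \geq 3$.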

Definition 5.65. A tempered distribution is a continuous linear map $\varphi: \mathcal{S}(\mathbb{R}) \to \mathbb{C}$. The space of all such maps, equipped with the topology of pointwise convergence (i.e., $\varphi_\lambda \to \varphi$ iff $\varphi_\lambda(f) \to \varphi(f)$ for each $f \in \mathcal{S}(\mathbb{R})$) is denoted by $\mathcal{S}'(\mathbb{R})$.

It can be shown that (because of the metrizability of $\mathcal{S}(\mathbb{R})$) continuity is the same as sequential continuity, i.e., some linear map $\varphi: \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ belongs to $\mathcal{S}'(\mathbb{R})$ iff $\lim_N \varphi(f_N) = \varphi(f)$ for each convergent sequence $f_N \to f$ in $\mathcal{S}(\mathbb{R})$. Like $\mathcal{S}(\mathbb{R})$, the tempered distributions $\mathcal{S}'(\mathbb{R})$ form a (locally convex) topological vector space, that is, a vector space with a topology in which addition and scalar multiplication are continuous. The topology of $\mathcal{S}'(\mathbb{R})$ is given by a family of seminorms, namely $\|\varphi\|_f = |\varphi(f)|$, $f \in \mathcal{S}(\mathbb{R})$. A simple way to prove that $\varphi \in \mathcal{S}'(\mathbb{R})$ is to find some $(n,m)$ for which $|\varphi(f)| \leq C\|f\|_{n,m}$ for each $f \in \mathcal{S}(\mathbb{R})$, since in that case $f_N \to f$, which means that $\|f_N - f\|_{n,m} \to 0$ for all $n,m \in \mathbb{N}$, certainly implies that $\varphi(f_N) \to \varphi(f)$, so that $\varphi$ is continuous. For example, the evaluation maps $\delta_x$ defined by $\delta_x(f) = f(x)$ are continuous (take $n = m = 0$). Similarly, each finite measure on $\mathbb{R}$ defines a tempered distribution. Taking the $(0,m)$ seminorm shows that the maps $f \mapsto f^{(m)}(x)$ for fixed $m \in \mathbb{N}$ and $x \in \mathbb{R}$ are tempered distributions.

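A less obvious example (defining a so-called Gelfand triple) is as follows:

Proposition 5.66. We have continuous dense inclusions

$$\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}), \tag{5.308}$$

where the second inclusion identifies $\varphi \in L^2(\mathbb{R})$ with the map

$$f \mapsto \langle\varphi,f\rangle = \int_{\mathbb{R}} dx\,\overline{\varphi(x)}f(x). \tag{5.309}$$

Proof. As vector spaces, the first inclusion is obvious. For $f \in \mathcal{S}(\mathbb{R})$ we estimate

$$\|f\|_2^2 = \int_{\mathbb{R}} dx\,|f(x)|\cdot|f(x)| \leq \|f\|_\infty\|f\|_1; \tag{5.310}$$
$$\|f\|_1 = \int_{\mathbb{R}} dx\,\frac{(1+x^2)|f(x)|}{1+x^2} \leq \int_{\mathbb{R}} \frac{dx}{1+x^2}\,\|(1+x^2)f\|_\infty \leq \pi(\|f\|_{0,0} + \|f\|_{2,0}), \tag{5.311}$$

so that, noting that $\|f\|_{0,0} = \|f\|_\infty$, we have

$$\|f\|_2^2 \leq \pi(\|f\|_{0,0} + \|f\|_{2,0})\|f\|_{0,0}. \tag{5.312}$$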

Hence $f_\lambda \to f$ in $\mathcal{S}(\mathbb{R})$, which incorporates the conditions $\|f_\lambda - f\|_{0,0} \to 0$ and $\|f_\lambda - f\|_{2,0} \to 0$, implies $\|f_\lambda - f\|_2 \to 0$. This shows that the first inclusion in (5.308) is continuous. Density may be proved in two steps. First, take some fixed positive function $h \in C_c^\infty(-1,1)$ with the property $\int dx\, h(x) = 1$, and define $h_n(x) = nh(nx)$, so that informally $h_n \in C_c^\infty(\mathbb{R})$ converges to a $\delta$-function as $n \to \infty$. For each $\psi \in L^2(\mathbb{R})$ we consider the convolution $h_n*\psi$, where for suitable $f,g$,

$$f*g(x) = \int_{\mathbb{R}} dy\, f(x-y)g(y). \tag{5.313}$$

Then $h_n*\psi \in C^\infty(\mathbb{R})\cap L^2(\mathbb{R})$ and, from elementary analysis, $\|h_n*\psi - \psi\| \to 0$.

Second, for $\psi \in C_c(\mathbb{R})$, the functions $h_n*\psi$ lie in $C_c^\infty(\mathbb{R})$ and hence in $\mathcal{S}(\mathbb{R})$. Since $C_c(\mathbb{R})$ is dense in $L^2(\mathbb{R})$ by Theorem B.30, for $\psi \in L^2(\mathbb{R})$ and $\varepsilon > 0$ we can find $\varphi \in C_c(\mathbb{R})$ such that $\|\psi - \varphi\| < \varepsilon/2$, and (as just shown) find $n$ such that $\|\varphi - \varphi_n\| < \varepsilon/2$, where $\varphi_n = h_n*\varphi$, whence $\|\psi - \varphi_n\| < \varepsilon$. This proves that $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$. The second inclusion is continuous by Cauchy-Schwarz, which gives

$$|\varphi(f)| \leq \|\varphi\|_2\|f\|_2,$$

to be combined with (5.312). It should be noted that also the second inclusion in (5.308) is indeed an injection, i.e., that $\varphi(f) = 0$ for each $f \in \mathcal{S}(\mathbb{R})$ implies $\varphi = 0$ in $L^2(\mathbb{R})$; this is true because $\mathcal{S}(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, plus the standard fact that, in any Hilbert space $H$, if $\langle\varphi,f\rangle = 0$ for all $f$ in some dense subspace of $H$, then $\varphi = 0$. Finally, the fact that $L^2(\mathbb{R})$ is dense in the seemingly huge space $\mathcal{S}'(\mathbb{R})$ follows from the even more remarkable fact that $\mathcal{S}(\mathbb{R})$ is dense in $\mathcal{S}'(\mathbb{R})$. On top of the functions $h_n$ just defined, we also employ a function $\chi \in C_c^\infty(\mathbb{R})$ such that $\chi(x) = 1$ on $(-1,1)$, and define $\chi_n(x) = \chi(x/n)$, so that informally $\lim_{n\to\infty}\chi_n(x) = 1$ (as opposed to the $h_n$, which converge to a $\delta$-function as $n \to \infty$). If for any $g \in \mathcal{S}(\mathbb{R})$ and any $\varphi \in \mathcal{S}'(\mathbb{R})$ we define $g\varphi$ as the distribution that maps $f \in \mathcal{S}(\mathbb{R})$ to $\varphi(fg)$, and similarly define $g*\varphi$ as the distribution that maps $f$ to $\varphi(g*f)$, we may define a sequence of distributions $\varphi_n = h_n*(\chi_n\varphi)$. From the point of view of (5.308), these correspond to functions $\varphi_n \in \mathcal{S}(\mathbb{R})$ in the sense that $\varphi_n(f) = \int dx\,\varphi_n(x)f(x)$, where $f \in \mathcal{S}(\mathbb{R})$. Using similar analysis as above, it then follows that for any $f \in \mathcal{S}(\mathbb{R})$ we have $\varphi_n(f) \to \varphi(f)$, so that $\varphi_n \to \varphi$ in $\mathcal{S}'(\mathbb{R})$. □

For our purposes, the point of all this is that we can define generalized derivatives of (tempered) distributions, and hence, because of (5.308), of functions in $L^2(\mathbb{R})$.

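Definition 5.67. For $\varphi \in \mathcal{S}'(\mathbb{R})$ and $m \in \mathbb{N}$, the $m$'th generalized derivative $\varphi^{(m)} \in \mathcal{S}'(\mathbb{R})$ is defined by

$$\varphi^{(m)}(f) = (-1)^m\varphi(f^{(m)}). \tag{5.314}$$

The idea is that under (5.308) this is an identity if $\varphi \in \mathcal{S}(\mathbb{R})$ (partial integration).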
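
For instance (a standard worked example, added here for illustration), let $\Theta$ be the Heaviside step function, regarded as a tempered distribution via $\Theta(f) = \int_0^\infty dx\, f(x)$; this is continuous because $|\Theta(f)| \leq \|f\|_1$, which is bounded by Schwartz seminorms as in (5.311). Then (5.314) with $m = 1$ gives, for each $f \in \mathcal{S}(\mathbb{R})$:

```latex
\Theta'(f) = -\Theta(f') = -\int_0^\infty dx\, f'(x) = f(0) = \delta_0(f),
```

so that $\Theta' = \delta_0$ as distributions, although $\Theta$ is not differentiable at $0$ in the classical sense.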

Like the constructions at the end of the proof of Proposition 5.66, this is a special case of a more general construction: whenever we have a continuous linear map $T: \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$, we obtain a dual continuous linear map $T': \mathcal{S}'(\mathbb{R}) \to \mathcal{S}'(\mathbb{R})$, defined by $T'\varphi = \varphi\circ T$, i.e.,

$$(T'\varphi)(f) = \varphi(T(f)). \tag{5.315}$$

Sometimes a slight change in the definition (as in (5.314), or as in the Fourier transform below) is appropriate so that the restriction of $T'$ to $\mathcal{S}(\mathbb{R})$ coincides with $T$.

Theorem 5.68. The momentum operator $p = -i\,d/dx$ is self-adjoint on the domain

$$D(p) = \{\psi \in L^2(\mathbb{R}) \mid \psi' \in L^2(\mathbb{R})\}, \tag{5.316}$$

where the derivative $\psi'$ is taken in the distributional sense (i.e., letting $\psi \in \mathcal{S}'(\mathbb{R})$).

Proof. We first show that $p$ is symmetric, or $p \subset p^*$. This comes down to

$$\langle\psi',\varphi\rangle = -\langle\psi,\varphi'\rangle, \tag{5.317}$$

for each $\psi,\varphi \in D(p)$, where both derivatives are "generalized". The most elegant proof (though perhaps not the shortest) uses the Sobolev space $H^1(\mathbb{R})$, which equals $D(p)$ as a vector space, now equipped, however, with the new inner product

$$\langle\psi,\varphi\rangle_{(1)} = \langle\psi,\varphi\rangle + \langle\psi',\varphi'\rangle, \tag{5.318}$$

with both inner products on the right-hand side in $L^2(\mathbb{R})$; the associated norm is

$$\|\psi\|_{(1)}^2 = \|\psi\|^2 + \|\psi'\|^2. \tag{5.319}$$

Similar to the Gelfand triple (5.308), we have dense continuous inclusions

$$\mathcal{S}(\mathbb{R}) \subset H^1(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}), \tag{5.320}$$

with analogous proof. All we need for Theorem 5.68 is the first inclusion of the triple (5.320): for $\psi \in H^1(\mathbb{R})$ we now have $h_n*\psi \in C^\infty(\mathbb{R})\cap H^1(\mathbb{R})$ as well as $h_n*\psi \to \psi$ in $H^1(\mathbb{R})$, both of which follow from the $L^2$-case plus the identity

$$(h_n*\psi)' = h_n*\psi'. \tag{5.321}$$

Using the same cutoff function $\chi$ as in the $L^2$ case, we have $\chi_n\psi' \to \psi'$ and $\chi_n'\psi \to 0$ in $L^2(\mathbb{R})$, so that $(\chi_n\psi)' \to \psi'$ in $L^2(\mathbb{R})$, and hence $\chi_n\psi \to \psi$ also in $H^1(\mathbb{R})$. Furthermore, the functions $\psi_n = h_n*(\chi_n\psi)$ lie in $C_c^\infty(\mathbb{R})$ and hence in $\mathcal{S}(\mathbb{R})$; using the above facts we obtain $\psi_n \to \psi$ in $H^1(\mathbb{R})$. In sum, for each $\psi \in H^1(\mathbb{R})$ we can find a sequence $(\psi_n)$ in $\mathcal{S}(\mathbb{R})$ such that $\psi_n \to \psi$ and $\psi_n' \to \psi'$ in $L^2(\mathbb{R})$. Hence

$$\langle\psi,\varphi'\rangle = \lim_n\langle\psi_n,\varphi'\rangle = -\lim_n\langle\psi_n',\varphi\rangle = -\langle\psi',\varphi\rangle. \tag{5.322}$$

For the converse, let $\psi \in D(p^*)$, so that by definition for each $\varphi \in D(p)$ we have

$$\langle p^*\psi,\varphi\rangle = \langle\psi,p\varphi\rangle = -i\langle\psi,\varphi'\rangle. \tag{5.323}$$

Since $\mathcal{S}(\mathbb{R}) \subset D(p)$, this is true in particular for each $\varphi \in \mathcal{S}(\mathbb{R})$, in which case the right-hand side equals $-i\psi'(\varphi)$, where the derivative is distributional. But this equals $\langle p^*\psi,\varphi\rangle$ and so the distribution $-i\psi'$ is given by taking the inner product with $p^*\psi \in L^2(\mathbb{R})$. Hence $-i\psi' = p^*\psi \in L^2(\mathbb{R})$, and in particular $\psi' \in L^2(\mathbb{R})$, so that $\psi \in D(p)$. This proves that $D(p^*) \subset D(p)$, and since from the first step we have the opposite inclusion, we find $D(p^*) = D(p)$ and $p^* = p$. □

For the free Hamiltonian $h_0 = -\Delta$ with $\Delta = d^2/dx^2$, we similarly have:

Theorem 5.69. The free Hamiltonian $h_0 = -\Delta$ is self-adjoint on the domain

$$D(\Delta) = \{\psi \in L^2(\mathbb{R}) \mid \psi'' \in L^2(\mathbb{R})\}, \tag{5.324}$$

where the double derivative $\psi''$ is taken in the distributional sense.

Although this may be proved in an analogous way, such proofs become increasingly burdensome as the number of derivatives grows. It is easier to use the Fourier transform (which also provides an alternative way of proving Theorem 5.68).

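Theorem 5.70. The formulae

$$\hat{f}(k) = \int_{\mathbb{R}}\frac{dx}{\sqrt{2\pi}}\,e^{-ikx}f(x); \tag{5.325}$$
$$\check{f}(x) = \int_{\mathbb{R}}\frac{dk}{\sqrt{2\pi}}\,e^{ikx}f(k), \tag{5.326}$$

are rigorously defined on $\mathcal{S}(\mathbb{R})$, $L^2(\mathbb{R})$, and $\mathcal{S}'(\mathbb{R})$, and provide continuous isomorphisms of each of these spaces. Furthermore, (5.326) is inverse to (5.325), i.e.

$$\check{\hat{f}} = \hat{\check{f}} = f, \tag{5.327}$$

so that we may (and often do) write $\hat{f} = \mathcal{F}(f)$ and $\check{f} = \mathcal{F}^{-1}(f)$ or $\check{f} = \mathcal{F}^*(f)$.

In all three cases we have the identities (in a distributional sense if appropriate)

$$\mathcal{F}(x^n f^{(m)})(k) = (i\,d/dk)^n(ik)^m\mathcal{F}(f)(k). \tag{5.328}$$

Finally, as a map $\mathcal{F}: L^2(\mathbb{R}) \to L^2(\mathbb{R})$ the Fourier transform is unitary, so that

$$\langle\hat{\psi},\hat{\varphi}\rangle = \langle\psi,\varphi\rangle. \tag{5.329}$$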


See §C.15 for further discussion. For example, we have

$$D(p) = \{\psi \in L^2(\mathbb{R}) \mid k\cdot\hat{\psi}(k) \in L^2(\mathbb{R})\}; \tag{5.330}$$
$$D(\Delta) = \{\psi \in L^2(\mathbb{R}) \mid k^2\cdot\hat{\psi}(k) \in L^2(\mathbb{R})\}. \tag{5.331}$$

Thus we may now reformulate Theorems 5.68 and 5.69 as follows:

Theorem 5.71. The momentum operator $p$ is self-adjoint on the domain (5.330). The free Hamiltonian $h_0 = -\Delta$ is self-adjoint on the domain (5.331).

Proof. Denoting multiplication by $x^n$ by the symbol $k^n$, we have

$$p = \mathcal{F}^{-1}k\,\mathcal{F}; \tag{5.332}$$
$$\Delta = -\mathcal{F}^{-1}k^2\mathcal{F}. \tag{5.333}$$

Hence the theorem follows from Proposition B.73 and unitarity of the Fourier transform $\mathcal{F}$ (plus the little observation that if $a = a^*$ on $D(a) \subset H$ and $u: H \to K$ is unitary, then $b = uau^*$ is self-adjoint on $D(b) = uD(a) \subset K$). □
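
The relations (5.332) - (5.333) can be checked numerically on a rapidly decaying function. The sketch below is an illustration only: it replaces $\mathcal{F}$ by the discrete Fourier transform on a large periodic grid (an assumption that is harmless here because the Gaussian is negligible at the boundary), with $\hbar = 1$ and $m = 1/2$ as in the text. It applies $p$ and $h_0$ to $\psi(x) = e^{-x^2/2}$ as Fourier multipliers and compares with the exact derivatives.

```python
import numpy as np

# Periodic grid on [-L/2, L/2); large enough that the Gaussian is
# negligible at the boundary, so periodicity plays no role.
N, L = 1024, 40.0
dx = L / N
x = -L / 2 + dx * np.arange(N)
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)   # discrete wave numbers

psi = np.exp(-x**2 / 2)

# p = F^{-1} k F and h0 = -Delta = F^{-1} k^2 F, cf. (5.332)-(5.333)
p_psi = np.fft.ifft(k * np.fft.fft(psi))
h0_psi = np.fft.ifft(k**2 * np.fft.fft(psi))

# Exact results: -i psi' = i x psi and -psi'' = (1 - x^2) psi
err_p = np.max(np.abs(p_psi - 1j * x * psi))
err_h0 = np.max(np.abs(h0_psi - (1 - x**2) * psi))
print(err_p, err_h0)
```

The sign convention $e^{-ikx}$ in (5.325) matches that of `np.fft.fft`, which is why multiplication by $+k$ reproduces $-i\,d/dx$.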

Much is known about regularity properties of functions in such domains, e.g.,

$$D(p) \subset C_0(\mathbb{R}); \tag{5.334}$$
$$D(\Delta) \subset C_0^{1}(\mathbb{R}). \tag{5.335}$$

These are the most elementary cases of the famous Sobolev Embedding Theorem.

If $\psi \in D(p)$, then $k \mapsto (1+k^2)^{1/2}\hat{\psi}(k)$ is in $L^2(\mathbb{R})$, so applying Hölder's inequality (B.15) with $p = q = 2$ to $f(k) = (1+k^2)^{1/2}\hat{\psi}(k)$ and $g(k) = (1+k^2)^{-1/2}$, which is in $L^2(\mathbb{R})$, too, gives $\hat{\psi} \in L^1(\mathbb{R})$. The Riemann-Lebesgue Lemma (see §C.15) then yields $\psi \in C_0(\mathbb{R})$. To prove (5.335), one uses $(1+k^2)$ rather than its square root. Finally, we give a common domain of essential self-adjointness for $q$, $p$, and $h_0$.

Proposition 5.72. The operators $q$, $p$, and $h_0$ are essentially self-adjoint on $\mathcal{S}(\mathbb{R})$.

Proof. We see from (5.332) that the cases of $p$ and $q$ are similar, so we only explain the case of $q$. Denoting the operator of multiplication by $x$ on the domain $\mathcal{S}(\mathbb{R})$ by $q_0$, as in the proof of Proposition B.73 it is easy to see that $D(q_0^*) = D(q)$. Fourier-transforming, the fact that $\mathcal{S}(\mathbb{R})$ is dense in $H^1(\mathbb{R})$ (cf. the proof of Theorem 5.68) shows that $D(q_0^-) = D(q)$, so that $D(q_0^-) = D(q_0^*)$. The actions of $q_0^*$ and $q_0^-$ obviously being given by multiplication by $x$ in both cases, we have $q_0^* = q_0^-$.

The proof for $h_0$ is similar; in the second step we now use the fact that $\mathcal{S}(\mathbb{R})$ is dense in $H^2(\mathbb{R})$, defined as $D(\Delta)$, as in (5.324), but now seen as a Hilbert space in the inner product $\langle\psi,\varphi\rangle_{(2)} = \langle\psi,\varphi\rangle + \langle\psi'',\varphi''\rangle$, with corresponding norm given by $\|\psi\|_{(2)}^2 = \|\psi\|^2 + \|\psi''\|^2$; this is proved just as in the case of a single derivative. □

We also say that $\mathcal{S}(\mathbb{R})$ is a core for the operators in question. For example, the canonical commutation relations $[q,p] = i\hbar\cdot 1_H$ rigorously hold on this domain.
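
As a minimal worked check (with $\hbar = 1$, as assumed above): for $\psi \in \mathcal{S}(\mathbb{R})$ both $q\psi$ and $p\psi$ again lie in $\mathcal{S}(\mathbb{R})$, so the commutator is defined on all of $\mathcal{S}(\mathbb{R})$, and the product rule gives

```latex
[q,p]\psi(x) = x\bigl(-i\psi'(x)\bigr) - (-i)\frac{d}{dx}\bigl(x\psi(x)\bigr)
             = -ix\psi'(x) + i\psi(x) + ix\psi'(x)
             = i\psi(x).
```

On the larger domains $D(q)$ and $D(p)$ this computation need not make sense, since $q\psi$ may fail to lie in $D(p)$ (and vice versa).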



5.12 Stone’s Theorem 

We now come to a central result on symmetries in quantum mechanics "explaining" the Hamiltonian. Recall that a continuous unitary representation of $\mathbb{R}$ (as an additive group) on a Hilbert space $H$ is a map $t \mapsto u_t$, where $t \in \mathbb{R}$ and each $u_t \in B(H)$ is unitary, such that the associated map $\mathbb{R}\times H \to H$, $(t,\psi) \mapsto u_t\psi$, is continuous, and

$$u_su_t = u_{s+t}, \quad s,t \in \mathbb{R}; \tag{5.336}$$
$$u_0 = 1_H; \tag{5.337}$$
$$\lim_{t\to 0}u_t\psi = \psi \quad (t \in \mathbb{R},\ \psi \in H). \tag{5.338}$$

These conditions imply

$$\lim_{t\to s}u_t\psi = u_s\psi \quad (s,t \in \mathbb{R},\ \psi \in H). \tag{5.339}$$

Note that according to Proposition 5.36 continuity may be replaced by weak measurability. Probably the simplest nontrivial example is given by $H = L^2(\mathbb{R})$ and

$$u_t\psi(x) = \psi(x - t). \tag{5.340}$$

To prove (5.338), we use a routine $\varepsilon/3$ argument. We first prove (5.338) for $\psi \in C_c(\mathbb{R})$, where it is elementary in the sup-norm, i.e., $\lim_{t\to 0}\|u_t\psi - \psi\|_\infty = 0$ by continuity and hence (given compact support) uniform continuity of $\psi$. But then the (ugly) estimate $\|\psi\|_2 \leq |K|^{1/2}\|\psi\|_\infty$, where $K \subset \mathbb{R}$ is any compact set containing the support of $\psi$ and $|K|$ is its Lebesgue measure, also yields $\lim_{t\to 0}\|u_t\psi - \psi\|_2 = 0$. Hence for $\varepsilon > 0$ we may find $\delta > 0$ such that $\|u_t\psi - \psi\|_2 < \varepsilon/3$ whenever $|t| < \delta$. For general $\psi' \in H$, we find $\psi \in C_c(\mathbb{R})$ such that $\|\psi - \psi'\| < \varepsilon/3$, and, using unitarity of $u_t$, estimate

$$\|u_t\psi' - \psi'\| \leq \|u_t\psi' - u_t\psi\| + \|u_t\psi - \psi\| + \|\psi - \psi'\| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon.$$

In the context of quantum mechanics, physicists formally write

$$u_t = e^{-ita}, \tag{5.341}$$

where $a$ is usually thought of as the Hamiltonian of the system, although in the previous example it is rather the momentum operator. In any case, we avoid the notation $h$ instead of $a$ here, partly in order to rightly suggest far greater generality of the construction and partly to avoid confusion with the notation in §B.21; if $h$ is the Hamiltonian, one would have $a = h/\hbar$ in (5.341). Mathematically speaking, if $a$ is self-adjoint, eq. (5.341) is rigorously defined by Theorem B.158 as $u_t = e_t(a)$, where

$$e_t(x) = \exp(-itx). \tag{5.342}$$

Conversely, given a continuous unitary representation $t \mapsto u_t$ of $\mathbb{R}$ on $H$, one may attempt to define an operator $a$ by specifying its domain and action by

$$D(a) = \Bigl\{\psi \in H \;\Big|\; \lim_{s\to 0}\frac{u_s - 1}{s}\psi \text{ exists}\Bigr\}; \tag{5.343}$$
$$a\psi = i\lim_{s\to 0}\frac{u_s - 1}{s}\psi \quad (\psi \in D(a)). \tag{5.344}$$

Stone's Theorem makes this rigorous, and even turns the passage from the generator $a$ to the unitary group $t \mapsto u_t$ (and back) into a bijective correspondence.

Theorem 5.73. 1. If $a: D(a) \to H$ is self-adjoint, the map $t \mapsto u_t$ defined by (5.341), which is rigorously defined by Proposition B.159 with (5.342), defines a continuous unitary representation of $\mathbb{R}$ on $H$.

2. Conversely, given such a representation, the operator a defined by (5.343) - 
(5.344) is self-adjoint; in particular, D{a) is dense in H. 

3. These constructions are mutually inverse. 
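
In finite dimension every self-adjoint operator is bounded and all domain questions disappear, but the two passages in the theorem can still be illustrated numerically. The following sketch (an illustration under that simplifying assumption; the random matrix `a` and the step size `h` are hypothetical choices) builds $u_t = e^{-ita}$ from the spectral decomposition of a Hermitian matrix, checks (5.336) and unitarity, and recovers the generator from the difference quotient in (5.344).

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
a = (m + m.conj().T) / 2                  # a hypothetical self-adjoint generator

evals, evecs = np.linalg.eigh(a)

def u(t: float) -> np.ndarray:
    """u_t = exp(-i t a), built from the spectral decomposition of a."""
    return (evecs * np.exp(-1j * t * evals)) @ evecs.conj().T

s, t = 0.3, -1.1
group_law = bool(np.allclose(u(s) @ u(t), u(s + t)))           # (5.336)
unitary = bool(np.allclose(u(t).conj().T @ u(t), np.eye(4)))

# Recover the generator via (5.344) with a small but finite step.
h = 1e-6
a_recovered = 1j * (u(h) - np.eye(4)) / h
generator_error = float(np.max(np.abs(a_recovered - a)))
print(group_law, unitary, generator_error)
```

The residual `generator_error` is of order `h`, reflecting that the limit $s \to 0$ in (5.344) is only approximated.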

Proof. We use the setting of §B.21, so that $b$ is the bounded transform of $a$.

1. Eqs. (5.336) - (5.337) are immediate from Theorem B.158, which also yields unitarity of each operator $u_t$. To prove (5.338) we first take $\varphi \in C_c^*(b)H$, which means that $\varphi$ is a finite linear combination of vectors of the type $\varphi = h(a)\psi$, where $h \in C_c(\sigma(a))$ and $\psi \in H$. Using (5.342) and (B.573), we have

$$\|u_t\varphi - \varphi\| \leq \|e_th - h\|_\infty\|\psi\| \leq \|h\|_\infty\|e_t - 1_K\|_\infty^{(K)}\|\psi\|, \tag{5.345}$$

where $K$ is the (compact) support of $h$ in $\sigma(a)$. Since $e_t \to 1$ uniformly on any compact set as $t \to 0$, this gives $\lim_{t\to 0}\|u_t\varphi - \varphi\| = 0$. Taking finite linear combinations of such vectors $\varphi$ gives the same result for any $\varphi \in C_c^*(b)H$ (with an extra step this could have been done on $C_0^*(b)H$, too).

Thus for $\varepsilon > 0$ we can find $\delta > 0$ so that $\|u_t\varphi - \varphi\| < \varepsilon/3$ whenever $|t| < \delta$. For general $\psi' \in H$, we find $\varphi \in C_c^*(b)H$ such that $\|\varphi - \psi'\| < \varepsilon/3$, and estimate

$$\|u_t\psi' - \psi'\| \leq \|u_t\psi' - u_t\varphi\| + \|u_t\varphi - \varphi\| + \|\varphi - \psi'\| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon,$$

since $\|u_t\psi' - u_t\varphi\| = \|\psi' - \varphi\|$ by unitarity of $u_t$. This is equivalent to (5.338).

2. For any $\psi \in H$ and $n \in \mathbb{N}$, define $\psi_n \in H$ by

$$\psi_n = n\int_0^\infty ds\, e^{-ns}u_s\psi, \tag{5.346}$$

either as a Riemann-type integral (whose approximants converge in norm) or as a functional $\varphi \mapsto n\int_0^\infty ds\, e^{-ns}\langle u_s\psi,\varphi\rangle$, which is obviously continuous and hence is represented by a unique vector $\psi_n \in H$. Then simple computations show that

$$\lim_{s\to 0}\frac{u_s - 1}{s}\psi_n = n(\psi_n - \psi),$$

so that $\psi_n \in D(a)$. The proof that $\psi_n \to \psi$ starts with the elementary estimate

$$\|\psi_n - \psi\| \leq n\int_0^\infty ds\, e^{-ns}\|u_s\psi - \psi\|,$$

in which we split up the $\int_0^\infty$ as $\int_0^\delta\cdots + \int_\delta^\infty\cdots$, where $\delta > 0$. Using strong continuity of the map $t \mapsto u_t$, i.e., (5.338), the first integral vanishes as $\delta \to 0$, uniformly in $n$ (it is bounded by $\sup_{0\leq s\leq\delta}\|u_s\psi - \psi\|$). In the second integral we estimate $\|u_s\psi - \psi\| \leq 2\|\psi\|$, which gives the bound $2\|\psi\|e^{-n\delta}$, and take the limit $n \to \infty$. Thus $\psi_n \to \psi$, so that $D(a)$ is dense in $H$.

To prove self-adjointness of $a$, we need a tiny variation on Theorem B.93:

Lemma 5.74. Let $a$ be symmetric. Then $a$ is self-adjoint (i.e. $a^* = a$) iff

$$\mathrm{ran}(a + i) = \mathrm{ran}(a - i) = H. \tag{5.347}$$

Proof. We only need the implication from (5.347) to $a^* = a$ (but the converse immediately follows from Theorem B.93). So assume (5.347). For given $\psi \in D(a^*)$ there must then be a $\varphi \in D(a)$ such that $(a^* - i)\psi = (a - i)\varphi$. Since $a$ is symmetric, we have $D(a) \subset D(a^*)$, so $\psi - \varphi \in D(a^*)$, and $(a^* - i)(\psi - \varphi) = 0$. But $\ker(a^* - i) = \mathrm{ran}(a + i)^\perp$, so $\ker(a^* - i) = 0$. Hence $\psi = \varphi$, and in particular $\psi \in D(a)$ and hence $D(a^*) \subset D(a)$. Since we already know the opposite inclusion, we have $D(a^*) = D(a)$. Given symmetry, this implies $a^* = a$. □


Continuing the proof of Theorem 5.73.2, symmetry of $a$ easily follows from its definition, combined with the property $u_s^* = u_s^{-1} = u_{-s}$. Indeed, for $\psi,\varphi \in D(a)$, the weak limit $s \to 0$ below exists by definition of $D(a)$, cf. (5.343), whence:

$$\langle\varphi,a\psi\rangle = i\lim_{s\to 0}\Bigl\langle\varphi,\frac{u_s - 1}{s}\psi\Bigr\rangle = -i\lim_{s\to 0}\Bigl\langle\frac{u_{-s} - 1}{-s}\varphi,\psi\Bigr\rangle = \langle a\varphi,\psi\rangle.$$

To prove that $\mathrm{ran}(a - i) = H$, we compute $(a - i)\psi_1 = -i\psi$, with $\psi_1$ defined by (5.346) with $n = 1$. The property $\mathrm{ran}(a + i) = H$ is proved in a similar way: now define $\psi_1 = \int_{-\infty}^0 ds\, e^{s}u_s\psi$ and obtain $(a + i)\psi_1 = i\psi$. Thus Lemma 5.74 applies.

3. Bijectivity has two directions: $a \mapsto u_t \mapsto a$ and $u_t \mapsto a \mapsto u_t$.

• Given $a$ and hence (5.341) defining $u_t$, we change notation from $a$ to $a'$ in (5.343) - (5.344) and need to show that $a' = a$. Denoting the restriction of $a$ to the domain $C_c^*(b)H$ by $a_0$, we first show that $a_0 \subset a'$. The technique to prove this is similar to the argument around (5.345). We initially assume that $\varphi \in D(a_0) = C_c^*(b)H$ takes the form $\varphi = h(a)\psi$ for some $h \in C_c(\sigma(a))$ and $\psi \in H$. Just a trifle more complicated than (5.345), using (5.342), (B.573), and unitarity of $u_t$, we estimate:

$$\Bigl\|\frac{u_{t+s}\varphi - u_t\varphi}{s} + ia_0u_t\varphi\Bigr\| \leq \Bigl\|\frac{e_sh - h}{s} + i\cdot\mathrm{id}_{\sigma(a)}h\Bigr\|_\infty\|\psi\| \leq \|h\|_\infty\Bigl\|\frac{e_s - 1_K}{s} + i\cdot\mathrm{id}_K\Bigr\|_\infty^{(K)}\|\psi\|,$$

so that by definition of the (strong) derivative we obtain

$$\frac{du_t}{dt}\varphi = \lim_{s\to 0}\frac{u_{t+s}\varphi - u_t\varphi}{s} = -iau_t\varphi, \tag{5.348}$$

initially for any $\varphi$ of the said form $h(a)\psi$, and hence, taking finite sums, for any $\varphi \in D(a_0)$. The existence of this limit (at $t = 0$) shows that, on the assumption $\varphi \in D(a_0)$, we have $\varphi \in D(a')$, and we also see that $a' = a$ on $D(a_0)$, or, in other words, that $a_0 \subset a'$. Since $a'$ is self-adjoint (by part 2 of the theorem) and hence closed, we have $a_0^- \subset a'$. Since $a_0$ is essentially self-adjoint by Theorem B.159, this gives $a \subset a'$. Taking adjoints reverses the inclusion, and since both operators are self-adjoint this gives $a = a'$.

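• Given $u_t$ and hence (5.343) - (5.344) defining $a$, we change notation from $u_t$ to $u_t'$ in (5.341) and need to show that $u_t' = u_t$. Indeed, let

$$\psi_t = u_t\psi, \tag{5.349}$$

and similarly $\psi_t' = u_t'\psi$. If $\psi \in D(a)$, then by definition of $a$ we have

$$i\frac{d\psi_t}{dt} = i\lim_{s\to 0}\frac{u_{t+s} - u_t}{s}\psi = i\lim_{s\to 0}\frac{u_s - 1}{s}u_t\psi = a\psi_t, \tag{5.350}$$

which also shows that $\psi_t \in D(a)$. Similarly, $i\,d\psi_t'/dt = a\psi_t'$, so that $\psi_t$ and $\psi_t'$ satisfy the same differential equation with the same initial condition $\psi_{t=0} = \psi'_{t=0} = \psi$.

Now consider $\tilde{\psi}_t = \psi_t - \psi_t'$, which once again satisfies the same equation (i.e., $i\,d\tilde{\psi}_t/dt = a\tilde{\psi}_t$), but this time with initial condition $\tilde{\psi}_0 = \psi_0 - \psi_0' = \psi - \psi = 0$. The key point is that any solution $\psi_t$ of this equation has the property $\|\psi_t\| = \|\psi_0\|$ for any $t \in \mathbb{R}$, since by symmetry of $a$,

$$\frac{d}{dt}\langle\psi_t,\psi_t\rangle = -i\bigl(\langle\psi_t,a\psi_t\rangle - \langle a\psi_t,\psi_t\rangle\bigr) = 0.$$

For our specific $\tilde{\psi}_t$ we have $\|\tilde{\psi}_0\| = 0$ and hence $\psi_t = \psi_t'$ for each $\psi \in D(a)$. Since $D(a)$ is dense in $H$ and both $u_t$ and $u_t'$ are bounded, this gives $u_t' = u_t$. □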

Corollary 5.75. With $t \mapsto u_t$ and $a$ defined and related as in Theorem 5.73, if $\psi \in D(a)$, for each $t \in \mathbb{R}$ the vector $\psi_t$ defined by (5.349) lies in $D(a)$ and satisfies

$$a\psi_t = i\frac{d\psi_t}{dt}, \tag{5.351}$$

whence $t \mapsto \psi_t$ is the unique solution of (5.351) with initial value $\psi_{t=0} = \psi$.

This follows from the proof of part 3 of Theorem 5.73. With $a = h/\hbar$ (as above), this is just the famous time-dependent Schrödinger equation

$$h\psi_t = i\hbar\frac{d\psi_t}{dt}. \tag{5.352}$$

Notes

§5.1. Six basic mathematical structures of quantum mechanics 

Wigner's Theorem was first stated by von Neumann and Wigner (1928), but the first proof appeared in Wigner (1931). See Bonolis (2004) and Scholz (2006) for some history. Instead of working with $\mathcal{P}_1(H)$ with the bilinear trace form expressing the transition probabilities, one may also formulate and prove Wigner's Theorem in terms of the projective Hilbert space $\mathbb{P}H$ equipped with the Fubini-Study metric, in which case the relevant symmetries may be defined geometrically as isometries. See Freed (2012) for this proof, as well as Brody & Hughston (2001) for the underlying geometry. Kadison's Theorem may be traced back from Kadison (1965). See also Moretti (2013). Ludwig symmetries go back to Ludwig (1983); see also Kraus (1983). Our approach to von Neumann symmetries was inspired by Hamhalter (2004), and has a large pedigree in quantum logic. Bohr symmetries were introduced in Landsman & Lindenhovius (2016), where Theorem 5.4.6 was also proved.

§5.2. The case $H = \mathbb{C}^2$

This material is partly based on Simon (1976). The covering map (5.46) has a nice geometric description; if $\Sigma = \mathbb{C}\cup\{\infty\}$ is the Riemann sphere, we have the well-known stereographic projection

$$S^2 \to \Sigma; \tag{5.353}$$
$$(x,y,z) \mapsto \frac{x+iy}{1-z}. \tag{5.354}$$

If $u \in SU(2)$ is given by (5.43), then the associated Möbius transformation

$$z \mapsto \frac{\alpha z+\beta}{-\bar{\beta}z+\bar{\alpha}}$$

is a bijection of $\Sigma$, whose associated transformation of $S^2$ is the rotation $R = \pi(u)$.

§5.3. Equivalence between the six symmetry theorems 

Most proofs may be also found in Cassinelli et al (2004) or Moretti (2013). 

§5.4. Proof of Jordan’s Theorem 

Our proof of Jordan’s Theorem is taken from Bratteli & Robinson (1987); see 
also Thomsen (1982) for a simplification of the purely algebraic step (which we 
delegated to Theorem C.175), originally proved by Jacobson & Rickart (1950). 
§5.5. Proof of Wigner’s Theorem 

There are many proofs of Wigner's Theorem, none of them really satisfactory (in this respect the situation is similar to Gleason's Theorem). Our proof follows Simon (1976), who in turn relies on Bargmann (1964) and Hunziker (1972). The proof in Cassinelli et al (2004) seems cleaner, but their proof of the additivity of their operator T® is not easy to follow. For a geometric approach see Freed (2012).
If $\dim(H) \geq 3$, the conclusion of Wigner's Theorem follows if $W$ merely preserves orthogonality (Uhlhorn, 1963). See also Cassinelli et al (2004). This, in turn, has been generalized in various directions, e.g. to indefinite inner product spaces (Molnár, 2002) as well as to certain Banach spaces, where one says that $x$ is orthogonal to $y$ if for all $\lambda \in \mathbb{C}$ one has $\|x+\lambda y\| \geq \|x\|$ (Blanco & Turnšek, 2006).

§5.6. Some abstract representation theory 

Among numerous books on representation theory, our personal favourite is Barut & Rączka (1977); Gaal (1973) and Kirillov (1976) are also classics, at least for the abstract theory. An interesting recent paper on the unitary group on infinite-dimensional Hilbert space is Schottenloher (2013).

§5.7. Representations of Lie groups and Lie algebras 

This section was inspired by Hall (2013) and Knapp (1988). For Lie’s Third
Theorem, see, for example, Duistermaat & Kolk (2000), §1.14. To obtain Theorem 5.41,
consider the canonical projection $\tilde{\pi} : \tilde{G} \to G$ and define $D = \tilde{\pi}^{-1}(\{e\})$. This is a
discrete normal subgroup of $\tilde{G}$, and it is an easy fact that a discrete normal subgroup of
any connected topological group must lie in its center. Note that a discrete subgroup
of the center of $\tilde{G}$ is automatically normal.

The exponentiation problem for skew-adjoint representations of $\mathfrak{g}$ is
considerably more complicated than in finite dimension. Let H be an infinite-dimensional
Hilbert space with dense subspace D and let $\rho : \mathfrak{g} \to L(D,H)$ be a linear map, where
$L(D,H)$ is the space of linear maps from D to H. We say that $\rho$ is a skew-adjoint
representation of $\mathfrak{g}$ if (i): D is invariant under $\rho(\mathfrak{g})$, (ii): the commutation relations
(5.157) hold on D, and (iii): each $i\rho(A)$ is essentially self-adjoint on D. For example,
we have seen that if $u : G \to U(H)$ is a unitary representation, then the construction
$\rho(A) = u'(A)$, defined on the Gårding domain $D = D_G$, fits the bill. Conversely,
additional conditions are needed for $\rho$ to exponentiate to a unitary representation. The
best-known of those is Nelson’s criterion: if, given a skew-adjoint representation
$\rho : \mathfrak{g} \to L(D,H)$, the Nelson operator or Laplacian $\Delta = \sum_k \rho(T_k)^2$ (where $(T_k)$ is a
basis of $\mathfrak{g}$) is essentially self-adjoint on D, then $\rho$ exponentiates to a unitary
representation of G (with additional remarks similar to those in Corollary 5.43).

§5.8. Irreducible representations of SU(2) 

§5.9. Irreducible representations of compact Lie groups 

See e.g. Knapp (1988), Simon (1996), and Deitmar (2005), and innumerable 
other books. This material ultimately goes back to (E.) Cartan and Weyl. 

§5.10. Symmetry groups and projective representations 

See Varadarajan (1985), Tuynman & Wiegerinck (1987), Landsman (1998a), 
Cassinelli et al (2004), and Hall (2013). For different proofs of Theorem 5.59 
(Bargmann, 1954) see Simms (1971) and Cassinelli et al (2004). Leaving out the 
anti-unitary symmetries is a pity; see e.g. Freed & Moore and Roberts (2016).

§5.11. Position, momentum, and free Hamiltonian

§5.12. Stone’s Theorem

See Reed & Simon (1972), Schmüdgen (2012), Moretti (2013), Hall (2013), and
many other books. Our proof of part 1 of Theorem 5.73 is original.

Chapter 6

Classical models of quantum mechanics 


This chapter gives an introduction to a chain of results attempting to exclude deeper
layers underneath quantum mechanics that restore some form of classical physics:

‘[Such results] more or less illustrate the ways along which some opponents might hope to
escape Bohr’s reasonings and von Neumann’s proof and the places where they are
dangerously near breaking their necks.’ (Groenewold, 1946, p. 454)

In so far as they are mathematically precise, such no-go results have their roots
in von Neumann’s 1932 book, which gave rise to two traditions that were often
in polemical opposition to each other. Mathematically minded authors typically
admired von Neumann’s exclusion of hidden variables, yet tried to strengthen his
theorem by weakening its assumptions; this sparked, for example, Gleason’s Theorem
(1957) as well as the Kochen-Specker Theorem (1967). Certain physicists (led
by Bell), on the other hand, tried to circumvent (and later even ridicule) von
Neumann’s work. A high point of this tradition was Bell’s Theorem from 1964, which
was informed not only by von Neumann, but even more so by the famous
Einstein-Podolsky-Rosen (EPR) paper from 1935, as well as by Bohm’s deterministic pilot
wave reformulation of quantum mechanics (1952). However, at the end of the day
these traditions turned out to be not really divergent after all: Bell not only
independently (and earlier) obtained a version of the Kochen-Specker Theorem, but, more
importantly, his results from 1964 turn out to be very closely related to the
culmination of the first tradition in the form of the so-called Free Will Theorem (FWT),
which was published by Conway and Kochen during 2006-2008. Indeed, although its
validity is uncontroversial, this theorem has been criticized on the following grounds:

1. Lack of novelty compared with the famous paper by Bell (1964), whose
assumptions and conclusions are at least quite similar to those of the FWT (although the
underlying proofs are mathematically quite distinct from those in the FWT).

2. Lack of novelty even within its own terms: versions of the FWT had actually been
around for decades under less illustrious titles and authorships, e.g. Heywood &
Redhead (1983), Stairs (1983), Brown & Svetlichny (1990), and Clifton (1993).

3. Circularity, in that indeterminism is presupposed (namely in the assumption that
‘experimenters have a certain freedom’) instead of derived.

One aim of this chapter is to clarify these matters, with the following conclusions:

1. The difference between earlier literature in the same direction and the FWT is
largely one of emphasis, namely on free will (!), exemplifying a recent trend
(also found elsewhere) in emphasizing free choice of the settings of experiments.
Unfortunately, like Bell, Conway and Kochen even mathematically use an informal
way of talking about free settings, not to speak of the complete absence of
any serious philosophical analysis of free will among all three authors (for which
perhaps Bell, but certainly not Conway and Kochen may be excused).

2. Granting the informal characterization of free settings, both Bell’s (1964)
Theorem and the FWT establish a contradiction between quantum mechanics,
determinism, and locality (in the sense of Bell, which in the presence of determinism
reduces to a no-signaling condition called parameter independence).

3. The technical difference between Bell’s Theorem and the FWT lies in four facts: 

a. Bell’s arguments rely on probability theory (whereas the FWT does not). 

b. The (optical) corner of quantum mechanics used in Bell’s Theorem may be 
replaced by the corresponding experimental results, whereas the FWT uses 
uncontroversial yet untested predictions about massive spin-1 particles. 

c. The FWT must assume perfect (EPR) correlations, which are difficult to realize
and hence are avoided by later versions of Bell’s Theorem (i.e. through the
CHSH inequalities rather than the original Bell inequalities).

d. Like EPR, Bell and his followers focused on locality right from the
beginning, and hence in Bell (1964) the inference is from locality to determinism.
Conway and Kochen, on the other hand, resolve the contradiction their FWT
established by inferring randomness of outcomes from freedom of settings.

We start with a very simple treatment of both von Neumann’s argument against
linear hidden variables and Kochen & Specker’s refinement of it, in which von
Neumann’s controversial linearity assumption is decisively weakened so as to only apply
to commuting operators; the Kochen-Specker Theorem excludes what are called
non-contextual quasi-linear hidden variables. We then present what we see as a
more transparent version of the FWT, whose key ingredient of replacing the
non-contextuality assumption in the Kochen-Specker Theorem by a locality condition
is preserved, but where this time the setting is completely deterministic. Freedom
of choice then arises as a very natural independence assumption, and any threat of
circularity is avoided: the conclusion is simply a contradiction between
determinism, freedom of choice (i.e. of apparatus settings), locality, and quantum mechanics.
Moreover, as we argue in §6.3, the philosophically precise concept of free will used
in the assumptions of the FWT is what Lewis coined ‘local miracle compatibilism’.

Following an interlude on the GHZ Theorem, which seamlessly fits into the given 
framework, we then turn to Bell’s Theorems, which we compare with the FWT. 

Finally, we give our own rigorous version of an argument first proposed by
Colbeck and Renner to the effect that, under suitable freeness of choice and no-signaling
conditions (similar to those in Bell’s Theorem and the FWT), as long as they are
compatible with quantum mechanics, hidden variables are at best irrelevant. In fact,
this can only be proved under much stronger assumptions, obscuring the claim.

6.1 From von Neumann to Kochen-Specker

Von Neumann’s Theorem 6.2 below was the first technical result excluding some
class of hidden variables underneath quantum mechanics, namely (in current
parlance) linear non-contextual hidden variables. This terminology requires some
explanation. First, theorems of this kind apparently accept the mathematical structure
of the observables prescribed by the usual formalism of quantum theory, i.e.,
observables are identified with elements of the self-adjoint part

\[
  H_n(\mathbb{C}) = M_n(\mathbb{C})_{\mathrm{sa}} = \{a \in M_n(\mathbb{C}) \mid a^* = a\} \tag{6.1}
\]

of the algebra $M_n(\mathbb{C})$ of $n \times n$ matrices (this simple case suffices to make all points
of conceptual interest). Short of introducing “hidden” observables, hidden variable
theories propose the existence of hidden states, which either replace or supplement
the usual quantum states (which in the case at hand would be density operators).
Mimicking classical (statistical) physics, such states are interpreted as probability
measures on some phase space X, whose points $x \in X$ assign sharp values to
quantum-mechanical observables. Naively, this is done through associated functions

\[
  V_x : H_n(\mathbb{C}) \to \mathbb{R}, \tag{6.2}
\]

but in fact this choice already commits us to the first of two possibilities, which we 
pragmatically present as theories predicting measurement outcomes: 

• In non-contextual deterministic theories of measurement, the outcome solely
depends on the observable a that is being measured and on the (possibly ‘hidden’)
state of the system. Theorem 6.2 below, then, rules out such theories in which
values are sharp (i.e., dispersion-free), and $V_x$ in (6.2) is linear. The
Kochen-Specker Theorem subsequently proves the same impossibility under a weaker
(and physically more reasonable) assumption called quasi-linearity.

• Contextual deterministic theories of measurement, on the other hand, allow the
outcome of some measurement of a to depend on the measurement context (as
well as on the state), which in this case is understood as the choice of possible
other (compatible) observables b measured together with a (i.e., $ab = ba$). This
seems a reasonable assumption, well within the spirit of quantum mechanics,
though perhaps not so in the extreme form later held by Heisenberg, according
to which measurement outcomes (or even “reality”) are “created” by the
measurement. Under a weakened non-contextuality assumption, Bell’s Theorem (cf.
§6.5) and the Free Will Theorem (§6.2) rule out such theories, too.

Definition 6.1. A non-contextual hidden variable is a map $V : H_n(\mathbb{C}) \to \mathbb{R}$ that
for each $a \in H_n(\mathbb{C})$, and in terms of the $n \times n$ unit matrix $1_n$, satisfies

\[
  V(a^2) = V(a)^2; \tag{6.3}
\]
\[
  V(1_n) = 1. \tag{6.4}
\]

That is, V is dispersion-free as well as normalized, respectively.

Theorem 6.2. For $n \geq 2$, non-zero linear dispersion-free maps $V : H_n(\mathbb{C}) \to \mathbb{R}$ do
not exist. In particular, linear non-contextual hidden variables do not exist.

Proof. Such maps extend to complex-linear dispersion-free maps $V : M_n(\mathbb{C}) \to \mathbb{C}$
by complex linearity, so that the theorem is equivalent to Proposition 2.10. □

As von Neumann perfectly well understood himself, his seemingly natural
linearity assumption (given the mathematical structure of quantum mechanics unearthed
by none other than he!) is unwarranted physically (and even mathematically, since
eigenvalues and eigenstates, which should be the hallmark of dispersion-free states,
are by no means linear in the underlying operator). This suggests the following:
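
The failure of linearity for eigenvalues can be seen in the smallest possible case. The following is a minimal sketch, assuming NumPy and using the Pauli matrices as two non-commuting self-adjoint $2 \times 2$ matrices (this choice of example is ours, not taken from the text): no eigenvalue of the sum is a sum of eigenvalues of the summands.

```python
import numpy as np

# Two non-commuting self-adjoint 2x2 matrices (the Pauli matrices).
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)


def spectrum(a):
    """Eigenvalues of a self-adjoint matrix, in ascending order."""
    return np.linalg.eigvalsh(a)


spec_x = spectrum(sigma_x)
spec_z = spectrum(sigma_z)
spec_sum = spectrum(sigma_x + sigma_z)

# If sharp values were both eigenvalues and additive, some eigenvalue of
# sigma_x + sigma_z would have to be a sum of eigenvalues of the summands.
candidate_sums = [x + z for x in spec_x for z in spec_z]
additive_match = any(np.isclose(s, e) for s in candidate_sums for e in spec_sum)

print(spec_x, spec_z, spec_sum, additive_match)
```

For commuting matrices the obstruction disappears, since they can be diagonalized simultaneously; this is what motivates restricting linearity to commuting pairs.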

Definition 6.3. A map $V : H_n(\mathbb{C}) \to \mathbb{R}$ is called quasi-linear if for all $s,t \in \mathbb{R}$ and
all $a,b \in H_n(\mathbb{C})$ that commute (i.e., $ab = ba$) one has

\[
  V(sa + tb) = sV(a) + tV(b). \tag{6.5}
\]

As in the linear case, such a map uniquely extends to a map $V : M_n(\mathbb{C}) \to \mathbb{C}$ that is
precisely a quasi-state in the sense of Definition 2.26. The following lemma will be
useful, also showing that the above objections to linearity have been met.

Lemma 6.4. Let $V : H_n(\mathbb{C}) \to \mathbb{R}$ be a quasi-linear non-contextual hidden variable.

1. For each $a \in H_n(\mathbb{C})$ the number $\lambda = V(a)$ is an eigenvalue of a.

2. If $(a_1,\ldots,a_k)$ pairwise commute, and $b = f(a_1,\ldots,a_k)$ for some polynomial f,
then $V(b) = f(V(a_1),\ldots,V(a_k))$.

More generally, it follows from Theorem C.24 that if H is a Hilbert space and
$V : B(H)_{\mathrm{sa}} \to \mathbb{R}$ is a quasi-linear non-contextual hidden variable (or, equivalently, its
complexification $V_{\mathbb{C}} : B(H) \to \mathbb{C}$ is a dispersion-free quasi-state), then $V(a) \in \sigma(a)$
(provided $a^* = a$). This implies the above lemma, but we also provide a direct proof.

Proof. For any $b \in H_n(\mathbb{C})$ with $ab = ba$, eq. (6.3) and quasi-linearity imply that

\[
  V(ab) = V(a)V(b); \tag{6.6}
\]

just evaluate $V((a \pm b)^2) = (V(a) \pm V(b))^2$. Taking $b = a^2$ etc. and also invoking
(6.4) then yields $V(p(a)) = p(V(a))$ for any polynomial p in a. If $\lambda_i$ are the
eigenvalues of a, its characteristic polynomial $p(a) = \prod_{i=1}^n (a - \lambda_i)$ satisfies $p(a) = 0$, so that
$V(p(a)) = 0$ and hence $p(V(a)) = 0$, or $\prod_{i=1}^n (\lambda - \lambda_i) = 0$. This implies that $\lambda = \lambda_i$
for some i. The second claim is proved in a similar way. □

Theorem 6.5. For $n \geq 3$, quasi-linear non-contextual hidden variables do not exist.

This is the Kochen-Specker Theorem. It follows from Gleason’s Theorem 2.28 and
von Neumann’s Theorem 6.2, since according to Corollary 2.29 to the former,
quasi-states on $M_n(\mathbb{C})$ are actually states (in other words, quasi-linear non-contextual
hidden variables are linear). However, Kochen and Specker also gave a direct proof of
their theorem, subsequently somewhat simplified along the following lines.

Proof. We prove the claim for n = 3, which (by restricting V to any self-adjoint
subalgebra of $M_n(\mathbb{C})$ isomorphic to $M_3(\mathbb{C})$) implies the result for all $n \geq 3$ also. To
prove Theorem 6.5 for n = 3, we interpret $H_3(\mathbb{C})$ as the algebra of observables of a
spin-1 particle and introduce the well-known angular momentum matrices

\[
  J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
  J_2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad
  J_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{6.7}
\]

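In what follows, we will heavily use the squares

\[
  J_1^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
  J_2^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
  J_3^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{6.8}
\]

each of which has eigenvalues 0 and 1. The $J_i^2$ commute by inspection, and satisfy

\[
  J_1^2 + J_2^2 + J_3^2 = 2 \cdot 1_3. \tag{6.9}
\]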
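
The properties just listed are easy to confirm numerically. The sketch below assumes NumPy and takes the matrices of (6.7) in the form $(J_k)_{ij} = -i\varepsilon_{kij}$; the variable names are ours. It checks self-adjointness, that the squares commute, the sum rule (6.9), and the spectra of the squares.

```python
import numpy as np

# Spin-1 angular momentum matrices, (J_k)_{ij} = -i * epsilon_{kij}, cf. (6.7).
J1 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
J2 = np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]])
J3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
J = [J1, J2, J3]
squares = [Jk @ Jk for Jk in J]

# Each J_k is self-adjoint, so it lies in H_3(C).
self_adjoint = all(np.allclose(Jk, Jk.conj().T) for Jk in J)

# The J_k themselves do not commute, but their squares do.
J_commute = np.allclose(J1 @ J2, J2 @ J1)
squares_commute = all(np.allclose(A @ B, B @ A) for A in squares for B in squares)

# Sum rule (6.9) and the spectrum {0, 1, 1} of each square.
sum_rule = np.allclose(sum(squares), 2 * np.eye(3))
square_spectra = [np.round(np.linalg.eigvalsh(S)).astype(int).tolist() for S in squares]

print(self_adjoint, J_commute, squares_commute, sum_rule, square_spectra)
```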

The (matrix-valued) angular momentum vector is given by

\[
  \mathbf{J} = J_1 \mathbf{e}_1 + J_2 \mathbf{e}_2 + J_3 \mathbf{e}_3, \tag{6.10}
\]

where $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ is the standard basis of $\mathbb{R}^3$ (seen as a vector space with the usual
inner product $\langle\cdot,\cdot\rangle$), i.e., $\mathbf{e}_1 = (1,0,0)$, etc., and the angular momentum $J_{\mathbf{u}}$ along an
arbitrary unit vector $\mathbf{u} = \sum_i u_i \mathbf{e}_i$ in $\mathbb{R}^3$ is given by

\[
  J_{\mathbf{u}} = \langle \mathbf{J}, \mathbf{u} \rangle = \sum_{i=1}^{3} J_i u_i. \tag{6.11}
\]

This brings us to the crucial point: a map $V : H_3(\mathbb{C}) \to \mathbb{R}$ induces a map $\tilde{V} : S^2 \to \mathbb{R}$
on the set of all unit vectors $\mathbf{u}$ in $\mathbb{R}^3$ via

\[
  \tilde{V}(\mathbf{u}) = V(J_{\mathbf{u}}^2). \tag{6.12}
\]

As usual, a basis of $\mathbb{R}^3$, denoted by $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$, is always assumed orthonormal.

Lemma 6.6. Let $V : H_3(\mathbb{C}) \to \mathbb{R}$ be a non-contextual quasi-linear hidden variable,
with associated map $\tilde{V} : S^2 \to \{0,1\}$ given by (6.12). Then:

1. $\tilde{V}(-\mathbf{u}) = \tilde{V}(\mathbf{u})$ for each $\mathbf{u} \in S^2$ (so that $\tilde{V}$ is defined on the real projective plane);

2. If $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ is a basis, then the triple $\tilde{V}(a) = (\tilde{V}(\mathbf{u}_1), \tilde{V}(\mathbf{u}_2), \tilde{V}(\mathbf{u}_3))$ must
contain a single 0 and two 1’s, i.e., $\tilde{V}(a)$ must be one of the triples

\[
  \lambda^{(1)} = (0,1,1); \quad \lambda^{(2)} = (1,0,1); \quad \lambda^{(3)} = (1,1,0). \tag{6.13}
\]

In Gleason-like language, $\tilde{V}$ is a 2-valued frame function of weight $w(\tilde{V}) = 2$.

Proof. If $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ is a basis, then $J_{\mathbf{u}_i} = u J_i u^*$ for i = 1, 2, 3, where u is the
$3 \times 3$ matrix with entries $u_{ij} = \langle \mathbf{u}_i, \mathbf{e}_j \rangle$. Since u is unitary, the matrices $J_{\mathbf{u}_i}$ and their
squares have the same eigenvalues and satisfy the same relations as the $J_i$ and their
squares. Thus the eigenvalues of $J_{\mathbf{u}_i}^2$ are 0 and 1, for fixed a the squares $J_{\mathbf{u}_i}^2$ mutually
commute, and they satisfy the sum rule (6.9), i.e., $J_{\mathbf{u}_1}^2 + J_{\mathbf{u}_2}^2 + J_{\mathbf{u}_3}^2 = 2 \cdot 1_3$, so
$\tilde{V}(\mathbf{u}_1) + \tilde{V}(\mathbf{u}_2) + \tilde{V}(\mathbf{u}_3) = 2$. The claim then follows from Definition 6.3 and Lemma 6.4. □

Now define a coloring of $\mathbb{R}^3$ as any map $\tilde{V} : S^2 \to \{0,1\}$ satisfying the two properties
in Lemma 6.6. The proof of Theorem 6.5 then reduces to the following lemma.

Lemma 6.7. There exists no coloring of $\mathbb{R}^3$.

Proof. Take the following unit vectors (some identical), grouped into 11 bases (for
simplicity we use unnormalized vectors, e.g., (1,0,1) stands for $(1/\sqrt{2}, 0, 1/\sqrt{2})$):


basis   u_1          u_2          u_3
a_1     (0,0,1)      (1,0,0)      (0,1,0)
a_2     (1,0,1)      (-1,0,1)     (0,1,0)
a_3     (0,1,1)      (0,-1,1)     (1,0,0)
a_4     (1,-1,√2)    (-1,1,√2)    (1,1,0)
a_5     (1,0,√2)     (-√2,0,1)    (0,1,0)
a_6     (√2,1,1)     (0,-1,1)     (-√2,1,1)
a_7     (√2,0,1)     (0,1,0)      (-1,0,√2)
a_8     (1,1,√2)     (1,-1,0)     (-1,-1,√2)
a_9     (0,1,√2)     (1,0,0)      (0,-√2,1)
a_10    (1,√2,1)     (-1,0,1)     (1,-√2,1)
a_11    (1,0,0)      (0,√2,1)     (0,-1,√2).

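We will show that one cannot even color this particular finite set of vectors (let alone
all unit vectors in $\mathbb{R}^3$). We denote a vector $\mathbf{u}_i$ in a basis $a_\mu$ by

\[
  \mathbf{u}_i^{(\mu)}, \quad i = 1,2,3, \quad \mu = 1,\ldots,11,
\]

and write e.g. $\tilde{V}(a_\mu) = (0,1,1)$ for the three conditions

\[
  \tilde{V}(\mathbf{u}_1^{(\mu)}) = 0, \quad \tilde{V}(\mathbf{u}_2^{(\mu)}) = 1, \quad \tilde{V}(\mathbf{u}_3^{(\mu)}) = 1.
\]

The main point is that if some coloring $\tilde{V}$ maps a specific vector $\mathbf{u}$ to 0, then all
vectors orthogonal to $\mathbf{u}$ must go to 1. In particular, two orthogonal vectors can never
both be sent to 0. To find a contradiction (to the assumption that $\tilde{V}$ exists), we try
to assign values $\tilde{V}(\mathbf{u}_i^{(\mu)})$ one after the other, starting in row 1. Here some specific
choices will be made, but by symmetry other choices lead to similar contradictions.

1. Suppose that $\tilde{V}(a_1) = (0,1,1)$ (i.e., $\tilde{V}(\mathbf{u}_1^{(1)}) = 0$ and $\tilde{V}(\mathbf{u}_2^{(1)}) = \tilde{V}(\mathbf{u}_3^{(1)}) = 1$). In
$a_2$ this forces $\tilde{V}(\mathbf{u}_3^{(2)}) = 1$, so that either $\mathbf{u}_1^{(2)}$ or $\mathbf{u}_2^{(2)}$ must be mapped to 0 (and
the other to 1). Let $\tilde{V}(\mathbf{u}_1^{(2)}) = 0$, so that $\tilde{V}(\mathbf{u}_2^{(2)}) = 1$, i.e., $\tilde{V}(a_2) = (0,1,1)$. In $a_3$
one has $\mathbf{u}_3^{(3)} = \mathbf{u}_2^{(1)}$, so $\tilde{V}(\mathbf{u}_3^{(3)}) = 1$. We choose $\tilde{V}(\mathbf{u}_1^{(3)}) = 0$ and hence $\tilde{V}(\mathbf{u}_2^{(3)}) = 1$,
so $\tilde{V}(a_3) = (0,1,1)$. In $a_4$, the vector $\mathbf{u}_3^{(4)}$ is orthogonal to $\mathbf{u}_1^{(1)}$, which has
been mapped to zero already, so that $\tilde{V}(\mathbf{u}_3^{(4)}) = 1$. The remaining free choice is
arbitrarily made as $\tilde{V}(\mathbf{u}_1^{(4)}) = 0$, so that $\tilde{V}(\mathbf{u}_2^{(4)}) = 1$ and hence $\tilde{V}(a_4) = (0,1,1)$.

2. But now everything is fixed for $a_5$ through $a_{11}$, as follows. From $a_5$, the vector $\mathbf{u}_3^{(5)}$
already occurred in $a_1$, and moreover, $\mathbf{u}_2^{(5)}$ is orthogonal to $\mathbf{u}_1^{(4)}$ from $a_4$.
Because $\tilde{V}(\mathbf{u}_1^{(4)}) = 0$, one must have $\tilde{V}(\mathbf{u}_2^{(5)}) = 1$. And so on and so forth, yielding
$\tilde{V}(a_\mu) = (0,1,1)$ for $\mu = 5,\ldots,10$ (as was the case also for $\mu = 1,2,3,4$).

3. In $a_{11}$ one has $\mathbf{u}_1^{(11)} = \mathbf{u}_2^{(1)}$, so $\mathbf{u}_1^{(11)}$ is mapped to 1. Furthermore, $\mathbf{u}_2^{(11)}$ is
orthogonal to $\mathbf{u}_1^{(4)}$, which was mapped to 0; hence $\mathbf{u}_2^{(11)}$ goes to 1. Finally, $\mathbf{u}_3^{(11)}$ is
orthogonal to $\mathbf{u}_1^{(10)}$, which was mapped to 0, so that $\mathbf{u}_3^{(11)}$ must go to 1. Thus

\[
  \tilde{V}(a_{11}) = (1,1,1). \tag{6.14}
\]

But (1,1,1) is not an admissible value of $\tilde{V}$! So $\tilde{V}$ and hence V cannot exist. □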
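
The chain of forced assignments in this proof can be replayed mechanically. The sketch below is only a check of the particular branch followed above, not an independent proof of Lemma 6.7: it does not cover the other choices, which the proof handles by symmetry. It assumes that the table entries of modulus greater than one are $\sqrt{2}$ (the value for which every row of the table is an orthogonal triple), and the function `propagate` is a hypothetical helper of ours. Starting from the four vectors chosen to be 0 in step 1, it applies only the two forced moves (everything orthogonal to a 0 gets 1; each basis has exactly one 0) until a basis is violated.

```python
import math
from itertools import combinations

R = math.sqrt(2.0)  # ASSUMPTION: large table entries denote sqrt(2)

BASES = [
    [(0, 0, 1), (1, 0, 0), (0, 1, 0)],
    [(1, 0, 1), (-1, 0, 1), (0, 1, 0)],
    [(0, 1, 1), (0, -1, 1), (1, 0, 0)],
    [(1, -1, R), (-1, 1, R), (1, 1, 0)],
    [(1, 0, R), (-R, 0, 1), (0, 1, 0)],
    [(R, 1, 1), (0, -1, 1), (-R, 1, 1)],
    [(R, 0, 1), (0, 1, 0), (-1, 0, R)],
    [(1, 1, R), (1, -1, 0), (-1, -1, R)],
    [(0, 1, R), (1, 0, 0), (0, -R, 1)],
    [(1, R, 1), (-1, 0, 1), (1, -R, 1)],
    [(1, 0, 0), (0, R, 1), (0, -1, R)],
]


def orthogonal(u, v):
    """Orthogonality of unnormalized vectors, up to rounding error."""
    return abs(sum(x * y for x, y in zip(u, v))) < 1e-9


def propagate(seed_zeros):
    """Extend a partial coloring by forced moves; return None on contradiction."""
    vectors = {v for basis in BASES for v in basis}
    value = {v: 0 for v in seed_zeros}
    changed = True
    while changed:
        changed = False
        # Rule 1: every vector orthogonal to a vector colored 0 is colored 1.
        for z in [v for v, c in value.items() if c == 0]:
            for w in vectors:
                if orthogonal(z, w):
                    if value.get(w) == 0:
                        return None  # two orthogonal vectors both sent to 0
                    if w not in value:
                        value[w] = 1
                        changed = True
        # Rule 2: each basis carries exactly one 0 and two 1's.
        for basis in BASES:
            colors = [value.get(v) for v in basis]
            if colors.count(1) == 3 or colors.count(0) > 1:
                return None  # inadmissible triple such as (1,1,1)
            if colors.count(1) == 2 and colors.count(0) == 0:
                value[basis[colors.index(None)]] = 0
                changed = True
    return value


bases_are_orthogonal = all(
    orthogonal(u, v) for basis in BASES for u, v in combinations(basis, 2)
)

# The four vectors sent to 0 in step 1 of the proof (first entries of a_1..a_4).
PROOF_CHOICES = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, -1, R)]
result = propagate(PROOF_CHOICES)
print(bases_are_orthogonal, result is None)
```

Seeding `propagate` with only the first choice, `[(0, 0, 1)]`, yields a consistent partial coloring, which illustrates that the contradiction needs the whole chain through $a_{11}$.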

Corollary 6.8. There is no function $\tilde{V}$ with the two properties stated in Lemma 6.6.

The Kochen-Specker Theorem is often stated in the following way.

Definition 6.9. For any finite-dimensional Hilbert space H, a coloring of the set
$\mathcal{P}_1(H)$ of one-dimensional projections on H is a function

\[
  W : \mathcal{P}_1(H) \to \{0,1\}
\]

such that for any resolution of the identity $(e_i)$ with $e_i \in \mathcal{P}_1(H)$, i.e.,

\[
  e_i e_j = \delta_{ij} e_i, \tag{6.15}
\]
\[
  \sum_i e_i = 1_H, \tag{6.16}
\]

one has

\[
  \sum_i W(e_i) = 1, \tag{6.17}
\]

so that there is exactly one member $e_i$ of the family such that $W(e_i) = 1$.

Note that if $e \in \mathcal{P}_1(H)$, then $e = e_\psi = |\psi\rangle\langle\psi|$ for some unit vector $\psi \in H$, so
that each basis $(\upsilon_i)$ of H defines such a family by $e_i = |\upsilon_i\rangle\langle\upsilon_i|$, and vice versa,
up to phase factors. The setting of Gleason’s Theorem is similar, with the crucial
difference that the function on $\mathcal{P}_1(H)$ in question then takes values in [0,1] instead
of {0,1} and hence can be shown to exist, even amply so (as there are many states).

Theorem 6.10. If $\dim(H) > 2$, there exists no coloring of $\mathcal{P}_1(H)$.

Proof. For $H = \mathbb{C}^3$, the existence of W would yield the existence of $\tilde{V}$ through

\[
  \tilde{V}(\mathbf{u}) = 1 - W(e_{\mathbf{u}}), \tag{6.18}
\]

where $\mathbf{u} \in S^2$ is regarded as a vector in $\mathbb{C}^3$. Property 1 in Lemma 6.6 is obviously
satisfied. To prove property 2, we note that for any unit vector $\mathbf{u} \in \mathbb{R}^3$, we have

\[
  J_{\mathbf{u}}^2 \mathbf{u} = 0, \tag{6.19}
\]

since an explicit computation based on (6.11) shows that, with $\mathbf{u} = (u_1, u_2, u_3)$,

\[
  J_{\mathbf{u}}^2 = \begin{pmatrix}
    u_2^2 + u_3^2 & -u_1 u_2 & -u_1 u_3 \\
    -u_1 u_2 & u_1^2 + u_3^2 & -u_2 u_3 \\
    -u_1 u_3 & -u_2 u_3 & u_1^2 + u_2^2
  \end{pmatrix}. \tag{6.20}
\]

It follows from rotation invariance that the eigenvalues of $J_{\mathbf{u}}^2$ are the same as those
of each $J_i^2$, cf. (6.8), i.e., $\lambda = 0$ with multiplicity one and $\lambda = 1$ with multiplicity
two. Hence (6.19) gives the projection $e_0$ onto the eigenspace of $J_{\mathbf{u}}^2$ for $\lambda = 0$ as

\[
  e_0 = |\mathbf{u}\rangle\langle\mathbf{u}| = e_{\mathbf{u}}. \tag{6.21}
\]

Property 2 in Lemma 6.6 then follows from the assumption that W is a coloring.
Since $\tilde{V}$ cannot exist by Lemma 6.7, neither can W. This proves the claim for $\mathbb{C}^3$.
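
The explicit matrix (6.20) says that, for a real unit vector, the squared angular momentum along $\mathbf{u}$ is the identity minus the projection onto $\mathbf{u}$. A minimal numerical sketch of this identity follows, assuming NumPy, the matrices of (6.7) in the form $(J_k)_{ij} = -i\varepsilon_{kij}$, and an arbitrarily chosen random unit vector; the helper name `J_along` is ours.

```python
import numpy as np

J = [
    np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]),
    np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]]),
    np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]),
]


def J_along(u):
    """Angular momentum J_u = sum_i u_i J_i along the vector u, cf. (6.11)."""
    return sum(c * Jk for c, Jk in zip(u, J))


rng = np.random.default_rng(seed=0)  # arbitrary seed, for illustration only
u = rng.normal(size=3)
u /= np.linalg.norm(u)

Ju2 = J_along(u) @ J_along(u)
e_u = np.outer(u, u)  # the projection |u><u| for a real unit vector

matches_620 = np.allclose(Ju2, np.eye(3) - e_u)  # matrix (6.20)
kills_u = np.allclose(Ju2 @ u, 0)                # eq. (6.19)
eigenvalues = np.linalg.eigvalsh(Ju2)            # 0 once, 1 twice

print(matches_620, kills_u, np.round(eigenvalues, 12))
```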

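We finish by induction. Suppose $\mathbb{C}^n$ contains some set $\{\mathbf{u}_k\}_{k \in K}$ of unit vectors
that cannot be colored, assuming that $\mathbf{u}_0 = (1,0,\ldots,0)$ lies in this set. We embed
each $\mathbf{u}_k$ into $\mathbb{C}^{n+1}$ by adding a zero at the end, calling the image $\mathbf{u}'_k$. Adding
$\mathbf{v} = (0,\ldots,0,1)$, the only possible coloring of the set $\{\mathbf{u}'_k, \mathbf{v}\}_{k \in K}$ in $\mathbb{C}^{n+1}$ is given by
$W(\mathbf{u}'_k) = 0$ for each $k \in K$ and $W(\mathbf{v}) = 1$. Indeed, if $W(\mathbf{u}'_{k_0}) = 1$ for some $k_0$, then,
since $\mathbf{v}$ is orthogonal to each $\mathbf{u}'_k$, we must have $W(\mathbf{v}) = 0$, which means that the
original set $\{\mathbf{u}_k\}_{k \in K}$ should be colorable in $\mathbb{C}^n$, but this is impossible by assumption.

We now embed each $\mathbf{u}_k$ into $\mathbb{C}^{n+1}$ by adding a zero at the beginning, denoting its
image by $\mathbf{u}''_k$, and add $\mathbf{u}'_0 = (1,0,\ldots,0,0)$. By the same token, the only coloring of
the set $\{\mathbf{u}''_k, \mathbf{u}'_0\}_{k \in K}$ is given by $W(\mathbf{u}''_k) = 0$ for each $k \in K$ and $W(\mathbf{u}'_0) = 1$. But this
leaves the set $\{\mathbf{u}'_k, \mathbf{u}''_k, \mathbf{v}\}_{k \in K}$ in $\mathbb{C}^{n+1}$ uncolorable, since colorability of $\{\mathbf{u}'_k, \mathbf{v}\}_{k \in K}$
gave $W(\mathbf{u}'_0) = 0$, whereas colorability of $\{\mathbf{u}''_k, \mathbf{u}'_0\}_{k \in K}$ gave $W(\mathbf{u}'_0) = 1$. □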

The set thus obtained is larger than necessary. For example, already for $H = \mathbb{C}^4$ the
following bases cannot be colored (again writing down unnormalized vectors):

basis   u_1           u_2           u_3          u_4
a_1     (0,0,0,1)     (0,0,1,0)     (1,1,0,0)    (1,-1,0,0)
a_2     (0,0,0,1)     (0,1,0,0)     (1,0,1,0)    (1,0,-1,0)
a_3     (1,-1,1,-1)   (1,-1,-1,1)   (1,1,0,0)    (0,0,1,1)
a_4     (1,-1,1,-1)   (1,1,1,1)     (1,0,-1,0)   (0,1,0,-1)
a_5     (0,0,1,0)     (0,1,0,0)     (1,0,0,1)    (1,0,0,-1)
a_6     (1,-1,-1,1)   (1,1,1,1)     (1,0,0,-1)   (0,1,-1,0)
a_7     (1,1,-1,1)    (1,1,1,-1)    (1,-1,0,0)   (0,0,1,1)
a_8     (1,1,-1,1)    (-1,1,1,1)    (1,0,1,0)    (0,1,0,-1)
a_9     (1,1,1,-1)    (-1,1,1,1)    (1,0,0,1)    (0,1,-1,0)

The proof is the following observation: if we present the coloring condition as

\[
  W(0,0,0,1) + W(0,0,1,0) + W(1,1,0,0) + W(1,-1,0,0) = 1; \tag{$a_1$}
\]
\[
  \vdots
\]
\[
  W(1,1,1,-1) + W(-1,1,1,1) + W(1,0,0,1) + W(0,1,-1,0) = 1, \tag{$a_9$}
\]

then since there are nine such equations the sum of the right-hand sides is odd,
whereas the sum of the left-hand sides is even, since each vector appears twice.
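
The parity argument can be checked directly on the nine bases of the table. The following sketch uses only the Python standard library (the variable names are ours): it confirms that every row is an orthogonal quadruple, that there are 18 distinct vectors each occurring in exactly two bases, and, by exhaustive search over all $2^{18}$ assignments, that no assignment satisfies the nine equations.

```python
from collections import Counter
from itertools import combinations, product

BASES = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Every row of the table is an orthogonal basis (integer arithmetic, exact).
bases_orthogonal = all(dot(u, v) == 0 for b in BASES for u, v in combinations(b, 2))

# Each of the distinct vectors occurs in exactly two bases.
multiplicity = Counter(v for b in BASES for v in b)
vectors = sorted(multiplicity)
index = {v: k for k, v in enumerate(vectors)}
basis_indices = [[index[v] for v in b] for b in BASES]


def is_coloring(bits):
    """Coloring condition: W sums to 1 over each of the nine bases."""
    return all(sum(bits[k] for k in idx) == 1 for idx in basis_indices)


# Exhaustive search over all 0/1 assignments to the distinct vectors.
colorings = [bits for bits in product((0, 1), repeat=len(vectors)) if is_coloring(bits)]

print(bases_orthogonal, len(vectors), set(multiplicity.values()), len(colorings))
```

The exhaustive search is redundant given the parity argument; it is included only as an independent check that the table has been transcribed consistently.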

To bridge the gap between the Kochen-Specker Theorem and the Free Will
Theorem, as well as the one between mathematics and physics, we now rephrase the
former as a “mini FWT”. We build an experiment consisting of a box containing a
spin-1 particle and a device capable of measuring all of the three observables

\[
  (J_{\mathbf{u}_1}^2, J_{\mathbf{u}_2}^2, J_{\mathbf{u}_3}^2)
\]

for an arbitrary basis a of $\mathbb{R}^3$; since the operators in question commute, this
simultaneous measurement is allowed by quantum theory. The choice of a is called
the setting of the experiment, traditionally denoted by A (in honor of Alice, who is
supposed to perform the experiment), with possible values A = a. In
“phenomenological” notation, the observable measured in an experiment like this is called F,
which in the case at hand has three components $F = (F_1, F_2, F_3)$: given the setting a,
the observable $F_i$ corresponds to $J_{\mathbf{u}_i}^2$. The notation $F = \lambda$ for $\lambda = (\lambda_1, \lambda_2, \lambda_3)$, i.e.,
$F_i = \lambda_i$, then expresses the fact that the outcome of a measurement of F is $\lambda$.

According to both quantum mechanics and our quasi-linear non-contextual
hidden variable theory, either $\lambda_i = 0$ or $\lambda_i = 1$, and $\lambda$ must lie in the value space

\[
  \Lambda = \{(0,1,1), (1,0,1), (1,1,0)\}; \tag{6.22}
\]

cf. Lemma 6.6 for the hidden variable theory, while in quantum mechanics (6.22)
follows from the fact that $\lambda$ must lie in the joint spectrum of the three operators $J_{\mathbf{u}_i}^2$.
This, in turn, means that there must be a joint eigenvector $\psi$ such that $J_{\mathbf{u}_i}^2 \psi = \lambda_i \psi$
for each i = 1,2,3. There are three such joint eigenvectors, namely $\mathbf{u}_1$, $\mathbf{u}_2$, and
$\mathbf{u}_3$ (initially defined as vectors in $\mathbb{R}^3$ but now seen as vectors in $\mathbb{C}^3$), with joint
eigenvalues (0,1,1), (1,0,1), and (1,1,0), respectively.

Otherwise, quantum mechanics and our quasi-linear non-contextual hidden
variable theory provide a different picture of the experiment. According to the former
theory, a given spin-1 particle may be prepared in a (pure) quantum state $\psi$, which
is a unit vector in $\mathbb{C}^3$. Quantum theory then merely predicts probabilities

\[
  P_\psi(F = \lambda \mid A = a) = p_{J_{\mathbf{u}_1}^2, J_{\mathbf{u}_2}^2, J_{\mathbf{u}_3}^2}(\lambda_1, \lambda_2, \lambda_3), \tag{6.23}
\]

for the possible outcomes $\lambda$, which according to the Born rule (2.21) are given by

\[
  P_\psi(F = \lambda^{(i)} \mid A = a) = |\langle \mathbf{u}_i, \psi \rangle|^2. \tag{6.24}
\]

So if $\psi = \mathbf{u}_i$, then the outcome will be $\lambda = \lambda^{(i)}$ with probability one, but in a
superposition $\psi = \sum_i c_i \mathbf{u}_i$ (with $\sum_i |c_i|^2 = 1$), quantum theory predicts a random sequence
of outcomes $\lambda^{(i)}$, each with probability $|c_i|^2$.

Let us note that quantum mechanics is non-contextual in the following
(probabilistic) sense. Alice could decide to perform just one measurement instead of three,
say $F_1$, with setting $a_1 = \mathbf{u}_1$, or perhaps she may not know if the other two are
performed. Fortunately, this does not matter, since for any unit vector $\psi \in \mathbb{C}^3$,

\[
  P_\psi(F_1 = \lambda_1 \mid A_1 = \mathbf{u}_1) = \sum_{\lambda_2, \lambda_3} P_\psi(F = \lambda \mid A = a), \tag{6.25}
\]

so that according to quantum mechanics, it does not matter for the Born probabilities
of the first measurement if the other two are performed or not.

The question now arises if some quasi-linear non-contextual hidden variable
theory could improve on this, in that the probabilities quantum theory assigns
to various outcomes are replaced by predictions. In the spirit of determinism (whilst
avoiding the appearance of circularity), such a theory should also predict the settings
of the experiment. Accordingly, the assumptions leading to our “mini FWT” are:

Definition 6.11. In the context of the experiment on spin-1 particles just discussed:

• Determinism firstly means that there is a state space X with associated functions

\[
  A : X \to X_A; \tag{6.26}
\]
\[
  F : X \to \Lambda, \tag{6.27}
\]

where $X_A$ is the set of all bases in $\mathbb{R}^3$ (i.e. $a \in X_A$), and $\Lambda$ is some set of possible
outcomes; these functions completely describe the experiment in the sense that
each state $x \in X$ determines both its settings $a = A(x)$ and its outcome $\lambda = F(x)$.
Here $A = (A_1, A_2, A_3)$, where the functions $A_i : X \to S^2$ (seen as the space of unit
vectors in $\mathbb{R}^3$) combine to define a basis, and $F = (F_1, F_2, F_3)$, where $F_i : X \to \mathbb{R}$.

Secondly, there exists some set $X_Z$ and an additional function

\[
  Z : X \to X_Z, \tag{6.28}
\]

such that

\[
  F = F(A,Z). \tag{6.29}
\]

More precisely, for each $x \in X$ one has

\[
  F(x) = \hat{F}(A(x), Z(x)) \tag{6.30}
\]

for a certain function $\hat{F} : X_A \times X_Z \to \Lambda$. Also this function is, of course, a triple
$\hat{F} = (\hat{F}_1, \hat{F}_2, \hat{F}_3)$, where $\hat{F}_i : X_A \times X_Z \to 2$ (with $2 = \{0,1\}$). In terms of (6.28), then:

• Nature then requires that $\Lambda$ is given by (6.22) (so that $F_i : X \to 2$).

• Freedom states that A and Z are independent in the sense that the function

\[
  A \times Z : X \to X_A \times X_Z; \quad x \mapsto (A(x), Z(x)) \tag{6.31}
\]

is surjective; in other words, for each $(a,z) \in X_A \times X_Z$ there is an $x \in X$ for which
$A(x) = a$ and $Z(x) = z$ (making a and z free variables).

• Non-contextuality (cf. Lemma 6.6) finally stipulates that $\hat{F}$ take the form

\[
  \hat{F}((\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3), z) = (\tilde{F}(\mathbf{u}_1, z), \tilde{F}(\mathbf{u}_2, z), \tilde{F}(\mathbf{u}_3, z)), \tag{6.32}
\]

for a single function $\tilde{F} : S^2 \times X_Z \to 2$ that also satisfies

\[
  \tilde{F}(-\mathbf{u}, z) = \tilde{F}(\mathbf{u}, z). \tag{6.33}
\]

“Nature” may be taken to be either an experimental result or an uncontroversial 
prediction of (some corner of) quantum mechanics. The function Z (including its 
domain Xz) describes anything relevant to the experiment (such as the behaviour of 
the particle) except the variables determining the settings (which do form part of 
X). The goal of the freedom assumption is to remove any potential dependencies 
between the variables (a,z), and hence between the physical system Alice performs
her measurements on, and the devices she performs her measurements with.

Corollary 6.12. Determinism, Nature, Freedom, and Non-contextuality are
contradictory.

Proof. For each $z \in X_Z$, define a function $\tilde{V}_z : S^2 \to 2$ by $\tilde{V}_z(\mathbf{u}) = \tilde{F}(\mathbf{u}, z)$. The
assumptions combine to give $\tilde{V}_z$ the same properties as $\tilde{V}$ in Lemma 6.6 (where z
“goes along for a free ride”). According to Corollary 6.8 (which applies because by
Freedom one can freely vary a for any given z), the function $\tilde{V}_z$ cannot exist. □

This “mini FWT” is a good exercise for the Free Will Theorem in the next section.
For example, let us note, as a warning, that if Determinism is seen as the culprit (and
hence falls), then the other assumptions in the (mini) FWT are no longer defined. This
blocks a direct inference from Freedom to Indeterminism à la Conway & Kochen.

6.2 The Free Will Theorem

The Free Will Theorem is similar in spirit to Corollary 6.12, with the difference that
the experiment now has two wings and the non-contextuality assumption is replaced
by a certain locality condition. This condition relates to the setting introduced by
Einstein, Podolsky, and Rosen in 1935 and further studied by Bohm, Bell, and
others, in which (in current jargon) two physicists, called Alice and Bob, are far apart
whilst performing simultaneous experiments on some correlated two-particle state
(technically speaking, their measurements need to be spacelike separated). In the
situation considered by EPR each particle had a spatial degree of freedom and hence
required an infinite-dimensional Hilbert space for its description, but, as
recognized by Bohm, the thrust of the argument comes out more clearly if each
particle merely has an internal degree of freedom (and is “frozen” otherwise).

Bell (1964) considered a pair of spin-1/2 particles (cf. §6.5), each of which has
Hilbert space $\mathbb{C}^2$ (although the famous experiments of Aspect testing the violation of
Bell’s inequalities used photons, which have the “same” Hilbert space), but because
of its reliance on the Kochen-Specker Theorem (which fails for $\mathbb{C}^2$) the Free Will
Theorem requires one dimension more, i.e., $H = \mathbb{C}^3$. As before, we see this as the
state space of a massive spin-1 particle. The price of this extra dimension is that
the pertinent experiment whose outcome provides the Nature input for the Free Will
Theorem has not actually been performed, but, as in the Bell case, the predictions
of quantum mechanics are uncontroversial and will serve as input instead.

These predictions are as follows. Alice and Bob measure on the correlated state 

V^o = (ei 061-l-e 2 ( 8 )e 2 + e3(g)e3)/'y3, (6.34) 

where we recall that ( 61 , 62 , 63 ) is the standard basis of now seen as a basis of 
C^. This state is rotation-invariant, which means that nonzero angular momentum in 
one particle must be compensated for in the other, creating the desired correlations. 

As before, we denote Alice’s setting by $A = a$, which remains the choice of some
basis of $\mathbb{R}^3$, but this time also Bob picks some basis $b$, so that we write $B = b$ for his
choice. Similar to Alice’s outcome $F = \lambda$ we denote Bob’s by $G = \gamma$, and quantum
mechanics provides all (Born) probabilities

$P_{\psi_0}(F = \lambda, G = \gamma \mid A = a, B = b) = p_{J_{u_1}^2, J_{u_2}^2, J_{u_3}^2, J_{v_1}^2, J_{v_2}^2, J_{v_3}^2}(\lambda_1, \lambda_2, \lambda_3, \gamma_1, \gamma_2, \gamma_3)$,

which are well defined because Alice’s squared angular momentum operators $J_{u_i}^2$
commute with Bob’s $J_{v_j}^2$ as a consequence of Einstein locality (stating that spacelike
separated observables commute). Note that similarly to $a = (u_1,u_2,u_3)$ for Alice’s
basis, we write $b = (v_1,v_2,v_3)$ for Bob’s. If Alice merely measures $F_i$ whilst Bob
measures $G_j$, then, as in the previous section, it does not matter which other (commuting)
operators are measured and/or whether Alice and Bob know about this, cf.
(6.25). Thus we may write either $(A = a, B = b)$ or $(A_i = u_i, B_j = v_j)$ for the settings,

and simple calculations show that the Born probabilities are given by:

$P_{\psi_0}(F_i = 1, G_j = 1 \mid A = a, B = b) = \tfrac{1}{3}(1 + \langle u_i, v_j\rangle^2)$; (6.35)

$P_{\psi_0}(F_i = 0, G_j = 0 \mid A = a, B = b) = \tfrac{1}{3}\langle u_i, v_j\rangle^2$; (6.36)

$P_{\psi_0}(F_i = 1, G_j = 0 \mid A = a, B = b) = \tfrac{1}{3}(1 - \langle u_i, v_j\rangle^2)$; (6.37)

$P_{\psi_0}(F_i = 0, G_j = 1 \mid A = a, B = b) = \tfrac{1}{3}(1 - \langle u_i, v_j\rangle^2)$, (6.38)


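where $\langle u_i, v_j\rangle^2 \equiv |\langle u_i, v_j\rangle|^2$, etc., since the vectors are real. In terms of the notation

$P_{\psi_0}(F_i = G_j \mid \cdot) = P_{\psi_0}(F_i = 0, G_j = 0 \mid \cdot) + P_{\psi_0}(F_i = 1, G_j = 1 \mid \cdot)$; (6.39)

$P_{\psi_0}(F_i \neq G_j \mid \cdot) = P_{\psi_0}(F_i = 0, G_j = 1 \mid \cdot) + P_{\psi_0}(F_i = 1, G_j = 0 \mid \cdot)$, (6.40)

this yields

$P_{\psi_0}(F_i = G_j \mid A = a, B = b) = \tfrac{1}{3}(1 + 2\langle u_i, v_j\rangle^2)$; (6.41)

$P_{\psi_0}(F_i \neq G_j \mid A = a, B = b) = \tfrac{2}{3}(1 - \langle u_i, v_j\rangle^2)$. (6.42)
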

The crucial point for the Free Will Theorem is that this implies perfect correlation: 


$P_{\psi_0}(F_i = G_j \mid A_i = B_j) = 1$, (6.43)

in agreement with the intuition about angular momentum expressed earlier. 
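
The probabilities (6.35) - (6.38) and the perfect correlation (6.43) can be checked numerically. The following Python sketch is not part of the original argument. It assumes the realization of spin-1 on $\mathbb{C}^3$ in which $J_u^2 = 1 - |u\rangle\langle u|$, so that outcome 0 corresponds to the projection onto the unit vector $u$; the function name `born_probabilities` is ours.

```python
import math

def born_probabilities(u, v):
    """Joint Born probabilities P(F_i = f, G_j = g) in the state psi_0 of (6.34).

    Assumption (ours): J_u^2 = 1 - |u><u| on C^3, so that outcome 0 is the
    projection onto the unit vector u and outcome 1 is its complement.
    """
    n = 3
    # Coefficients of psi_0 = sum_k e_k (x) e_k / sqrt(3) in the basis e_k (x) e_l
    psi0 = [[(1.0 if k == l else 0.0) / math.sqrt(3.0) for l in range(n)]
            for k in range(n)]

    def projections(w):
        p = [[w[k] * w[l] for l in range(n)] for k in range(n)]
        q = [[(1.0 if k == l else 0.0) - p[k][l] for l in range(n)]
             for k in range(n)]
        return {0: p, 1: q}

    def expectation(P, Q):
        # <psi_0, (P (x) Q) psi_0> for real matrices P and Q
        return sum(psi0[k][l] * P[k][kp] * Q[l][lp] * psi0[kp][lp]
                   for k in range(n) for l in range(n)
                   for kp in range(n) for lp in range(n))

    pu, pv = projections(u), projections(v)
    return {(f, g): expectation(pu[f], pv[g]) for f in (0, 1) for g in (0, 1)}

# Example: two unit vectors at an angle of 0.3 radians (an arbitrary choice)
u = (1.0, 0.0, 0.0)
v = (math.cos(0.3), math.sin(0.3), 0.0)
p = born_probabilities(u, v)
c = sum(x * y for x, y in zip(u, v)) ** 2        # <u, v>^2
print(p[(1, 1)], (1 + c) / 3)                    # compare with (6.35)
print(p[(0, 0)], c / 3)                          # compare with (6.36)
print(p[(0, 1)] + p[(1, 0)], 2 * (1 - c) / 3)    # compare with (6.42)
```

Taking `u` equal to `v` makes the two mixed probabilities vanish up to rounding, which is the perfect correlation (6.43).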

We now move to a (possibly counterfactual) deterministic description of this experiment
along the lines of the previous section. It is straightforward to adapt all
of Definition 6.11 except Non-contextuality (which after all is the assumption we
would like to get rid of!). With the obvious changes, we obtain:


• Determinism again first claims there is a state space $X$ with associated functions

$A : X \to X_A$; (6.44)

$B : X \to X_B$; (6.45)

$F : X \to \Lambda$; (6.46)

$G : X \to \Lambda$, (6.47)

where $X_A = X_B$ is the set of all bases in $\mathbb{R}^3$ and $\Lambda$ is some set of possible
outcomes, which completely describe the experiment in the sense that each
state $x \in X$ determines both its settings $(a = A(x), b = B(x))$ and its outcome
$(\lambda = F(x), \gamma = G(x))$. Here $a = (A_1,A_2,A_3)$ and $b = (B_1,B_2,B_3)$, where the functions
$A_i : X \to S^2$ (where $S^2$ is seen as the space of unit vectors in $\mathbb{R}^3$) combine
to define a basis (similarly for $B_j : X \to S^2$), and $F = (F_1,F_2,F_3)$. Secondly, there
exists some set $X_Z$ and an additional function $Z : X \to X_Z$ such that

$F = F(A,B,Z)$; (6.48)

$G = G(A,B,Z)$, (6.49)

in that for each $x \in X$ one has the functional relationships

$F(x) = \hat{F}(A(x),B(x),Z(x))$; (6.50)

$G(x) = \hat{G}(A(x),B(x),Z(x))$, (6.51)

for certain functions $\hat{F} : X_A \times X_B \times X_Z \to \Lambda$ and $\hat{G} : X_A \times X_B \times X_Z \to \Lambda$, each
of which is a triple $\hat{F} = (\hat{F}_1,\hat{F}_2,\hat{F}_3)$ with $\hat{F}_i : X_A \times X_B \times X_Z \to \mathbb{R}$, etc. The value
$z = Z(x)$ is just the traditional “hidden variable” (which is often denoted by $\lambda$).

• Freedom then states that $A$, $B$, and $Z$ are independent in that for each $(a,b,z) \in
X_A \times X_B \times X_Z$ there is an $x \in X$ for which $A(x) = a$, $B(x) = b$, and $Z(x) = z$.

• Nature requires that:

- $\Lambda$ is given by (6.22), i.e. $\hat{F}_i$ and $\hat{G}_j$, and hence $F_i$ and $G_j$, take values in $\{0,1\}$;

- The experiment measures squares of angular momenta, so that

$\hat{F}(a',b',z) = \hat{F}(a,b,z)$; (6.52)

$\hat{G}(a',b',z) = \hat{G}(a,b,z)$, (6.53)

whenever $(a',b')$ differ from $(a,b)$ by changing the sign of any basis vector;

- Perfect correlation obtains, cf. (6.43), i.e., writing $a = (u_1,u_2,u_3)$ for Alice’s
basis and $b = (v_1,v_2,v_3)$ for Bob’s, one has

$u_i = v_j \Rightarrow \hat{F}_i(a,b,z) = \hat{G}_j(a,b,z)$. (6.54)

We now come to the locality condition that is to replace Non-contextuality. This 
condition was first clearly stated by Bell (1964, p. 196), who attributes it to Einstein:

‘The vital assumption is that the result G for particle 2 does not depend on the setting a of
the magnet for particle 1, nor F on b.’

Noting various other notions of locality (such as Einstein locality in local quantum 
physics, which requires spacelike separated operators to commute, or Bell locality, 
discussed below), the above idea might be called Context locality, but we will simply 
refer to it as Locality. In our deterministic setting, a precise formulation is this:

• Locality means that $\hat{F}(A,B,Z)$ is independent of $B$ and $\hat{G}(A,B,Z)$ is independent
of $A$. In other words, we have $F = F(A,Z)$ and $G = G(B,Z)$, so that (with slight
abuse of notation) $\hat{F} : X_A \times X_Z \to \Lambda$ and $\hat{G} : X_B \times X_Z \to \Lambda$, or, then again, $F(x) =
\hat{F}(A(x),Z(x))$ and $G(x) = \hat{G}(B(x),Z(x))$, for each $x \in X$.

This finally brings us to (our reformulation of) the Free Will Theorem: 

Theorem 6.13. Determinism, Freedom, Nature, and Locality are contradictory. 

Proof. The Freedom assumption allows us to treat $(a,b,z)$ as free variables, a
fact that will tacitly be used all the time. First, taking $i = j$ in (6.54) shows that
$\hat{F}_i(u_1,u_2,u_3,z)$ only depends on $(u_i,z)$, whilst $\hat{G}_j(v_1,v_2,v_3,z)$ only depends on
$(v_j,z)$. Hence we write $\hat{F}_i(a,z) = \hat{F}_i(u_i,z)$, etc. Next, taking $i \neq j$ in (6.54) shows
that $\hat{F}_1(u,z) = \hat{F}_2(u,z) = \hat{F}_3(u,z)$. Consequently, the function $\hat{F} : X_A \times X_Z \to X_F$ is
given by (6.32). We are now back to the proof of Corollary 6.12, concluding that
such a function does not exist by Corollary 6.8. □
6.3 Philosophical intermezzo: Free will in the Free Will Theorem


‘The determinism-free will controversy has all of the earmarks of a dead problem. The 
positions are well staked out and the opponents manning them stare at each other in mutual 
incomprehension.’ (Earman, 1986, p. 235) 

The question arises which specific notion of free will is among the assumptions of 
the FWT (in the reformulation just given). To put this question in perspective, let 
us briefly recall the main point of the debate about free will. This concept has two 
poles. One is the “will” itself, requiring a sense of agency, deliberation, and control. 
This pole seems to require some form of determinism. A powerful expression is:

‘Fürst! Was Sie sind, sind Sie durch Zufall und Geburt. Was ich bin, bin ich durch mich.’*
(Beethoven, to his benefactor (!) Prince Lichnowsky) 

The other pole of free will is the adjective “free”, i.e., the ability to do otherwise, 
which at first sight requires indeterminism. The problem of free will is that these 
poles seem contradictory. Many authors conflate free will with moral responsibility: 

‘free will can be defined as the unique ability of persons to exercise control over their 
conduct in the manner necessary for moral responsibility.’ (McKenna & Coates, 2015) 

This aspect is irrelevant to our discussion, concerned as it is with the question what 
it would mean for Alice and Bob to choose their settings “freely” if determinism is 
assumed (it would have been different if one setting launched a nuclear missile). 
Even in our narrow context, the traditional philosophical stances are relevant: 

• Compatibilism denies the contradiction, claiming that free will and determinism 
coexist. This position may be defended in many ways, among which one finds: 

- Reconceptualizing “the ability to do otherwise” in a deterministic world. This 
will be our focus in what follows, especially in a version inspired by Lewis. 

- Belittling the relevance of “the ability to do otherwise”, as e.g. by Dennett: 

‘So if anyone at all is interested in the question of whether one could have done 
otherwise in exactly the same circumstances (and internal state) this will have to be 
a particularly pure metaphysical curiosity—that is to say, a curiosity so pure as to be 
utterly lacking in any ulterior motive, since the answer could not conceivably make 
any noticeable difference to the way the world went.’ (Dennett, 1984, p. 559). 

• Incompatibilism accepts the contradiction, once again branching off into: 

- Libertarianism, arguing that free will requires an indeterministic world. 

- Hard determinism, claiming determinism (which is assumed) blocks free will: 

‘Ein Mensch kann zwar tun was er will, aber nicht wollen was er will.’^ 

(Schopenhauer) 

- Hard incompatibilism, asserting that ‘every way you look at it you lose’: 
free will makes no sense in either a deterministic or an indeterministic world. 

* ‘Lord! What you are, you are through chance and birth. What I am, I am because of myself.’

^ ‘One can admittedly do what one wants, but one cannot want what one wants.’ 
Although hard incompatibilism has our sympathy, our opening question concerning
the notion of free will in the FWT drives us into the compatibilist direction,
since determinism is among the assumptions shown to be contradictory by Theorem 6.13.
Within compatibilism, we will be close to the well-known ‘local miracle’
variant thereof proposed by the philosopher David Lewis. Like other compatibilists
before him (starting at least with G.E. Moore), Lewis attempts to make sense of the
intuition that even in a deterministic world one in principle has the ability to act
differently from the way one actually does, despite the fact that the latter was
predetermined. A simple example is Alice’s choosing setting a by moving her hand in
a certain way, although she was able to choose a'. On the other hand, she could not 
have moved her hand with a speed greater than that of light, so her ability remains 
constrained by the laws of nature. Lewis asks us to distinguish between: 

• ‘I am able to do something such that, if I did it, a law would be broken.’ 

• ‘I am able to break a law.’ 

The latter is impossible, but the former is not on Lewis’s own theory of counterfactuals,
according to which the phrase ‘if I did it’ leads us to consider the possible
world in which doing ‘something’ is actually true, whilst in the possible worlds
under consideration as many other features as possible are kept the same as in the
actual world (the precise underlying measure of similarity is not important here).
Thus the phrase ‘a law would be broken’ refers to the laws of the actual world (in
which the alternative action is not realized). It seems to be of great importance to
Lewis that in the first case it is not the agent who would break a law; instead, it is the
breaking of some law of our actual world at an earlier time that enables the subject
to do in an alternative possible world what she could not do in our actual world.

By making this distinction, Lewis claims that he invalidates the seemingly lethal 
Consequence Argument against compatibilist free will, of which a simple version 
reads (assuming determinism, on which compatibilist free will is predicated): 

1. Alice’s actions are a necessary consequence of the laws of nature plus the state 
of the universe (or the relevant part thereof) at any earlier time; 

2. Alice is unable to render both (laws and earlier states) false; 

3. Alice is unable to render the consequences of laws and earlier states false; 

4. Ergo: Alice is unable to do otherwise than what she actually does. 

Lewis claims that statement 3 is ambiguous, in that it fails to distinguish between the 
two senses in his two bullet points above. The Consequence Argument requires the 
latter (which is false), whereas this argument itself is unsound on the former (which 
is true). This disambiguation of assumption 3 in the Consequence Argument, then, 
is supposed to save (compatibilist) free will. However, a considerable philosophical
literature suggests that the tension between Lewis’s denying the second bullet point
whilst accepting the first is pretty uncomfortable, reflecting the corresponding tension
between the conjunction of determinism and freedom in general; indeed, this is
what the FWT makes precise! Let us first point out that, at least in his terminology,
Lewis fails to make a clear distinction between laws of nature and initial states:
from the point of view of modern physics, this distinction is absolutely fundamental
(although it may disappear in post-modern physics based on e.g. quantum gravity).
Lewis’s examples of law-breaking events in our actual world typically refer to
violations of some law of nature (like exceeding the speed of light), whereas the (alleged)
law-breaking in his counterfactuals, such as choosing a' (where in fact Alice
did not do so), amounts to a change in some earlier state. Thus it might have been
more appropriate if the paper in which Lewis laid out his version of compatibilism
had been entitled ‘Are we free to change the states?’ instead of ‘Are we free to break
the laws?’. On this revision, his distinction of the two cases takes the following form:

• I am able to do something such that, if I did it, the state of the actual world at 
some earlier time would have been different. 

• I am able to change the actual state of the world. 

The latter remains impossible, while it is the former that enables free will. Applied 
to Alice, the former should mean (still in the compatibilist spirit of Lewis): 

• A slight alteration in the state of the actual world (which would have made it a 
different but very similar world according to Lewis) would have led Alice to do 
something (such as choosing a') that she did not do in the actual world (because 
according to determinism its actual state at any earlier time—as opposed to the 
counterfactual alternative state in the discussion—led her to choose a). 

We now make this revised version of Lewis’s local miracle compatibilism mathematically
precise, in a way that has the additional advantage of involving not only
“the ability to do otherwise”, but also the other component of free will, i.e. agency.
Here the intuition is that free will involves a separation between the agent, Alice
(who is to exercise it), and the rest of the world, under whose influence she acts.
Namely, as in the FWT, let X be the state space of the Universe, and let

$a = A(x)$ (6.55)

again be Alice’s setting, where $A : X \to X_A$, as before. We now assume that $a$ is
determined by her “inner state” $I$ as well as the “outer state” $O$ of the rest of the
world, under whose influence she acts. These, in turn, are determined by the state
$x \in X$ of the world. That is, $A = A(O,I)$, which expresses the existence of functions

$O : X \to X_O$; (6.56)

$I : X \to X_I$; (6.57)

$\hat{A} : X_O \times X_I \to X_A$, (6.58)

where $X_O$ and $X_I$ are certain sets, such that for each $x \in X$ one has

$A(x) = \hat{A}(O(x),I(x))$. (6.59)

In other words, for some given state $x$ of the world we have

$o = O(x)$; (6.60)

$i = I(x)$; (6.61)

$a = \hat{A}(o,i)$. (6.62)
Note that, in the spirit of Conway and Kochen, in the above analysis Alice (whose
free choice they after all believe to be ultimately a consequence of the free choice
of elementary particles) now plays the role of the spin-1 particles in the bipartite
experiment. Thus the analogy is between the triples:

$(a,z,\lambda) \in X_A \times X_Z \times \Lambda$; (6.63)

$(o,i,a) \in X_O \times X_I \times X_A$. (6.64)

• The first triple is defined in the experimental context of the FWT, where $a$ is the
setting of Alice’s wing of the experiment (which from the perspective of the spin-1
particle plays the role of the outer state of the world), $z$ is the inner state of the
particle, and $\lambda$ is the outcome of Alice’s measurement.

• The second pertains to the analysis of Alice’s “free” choice of the setting of her
experiment, where $o$ is the outer state of the world, $i$ is her inner state, and $a$ is
her actual setting, given $x \in X$ and hence $(o,i) = (O(x),I(x))$.

Beyond Determinism, which is expressed by the above framework, our fundamental
assumption underpinning compatibilist free will is Freedom, defined exactly
as in the FWT: $O$ and $I$ are independent in that the following function is surjective:

$O \times I : X \to X_O \times X_I$;
$x \mapsto (O(x),I(x))$, (6.65)

i.e., for each pair $(o,i) \in X_O \times X_I$ there is $x \in X$ for which (6.60) and (6.61) hold.

Rephrasing our earlier analysis in this elementary mathematical language, Lewis
wants to make sense of the idea that although Alice’s choice (6.62) at some fixed
time $t$ was determined by the state $x$ of the Universe at that time through (6.60) -
(6.61), or, equivalently, through (6.59), and hence (and this is the whole point of the
Consequence Argument Lewis challenges) by any earlier state $x_0$ of the Universe
at time $t_0$, nonetheless Alice was “able to act otherwise” at time $t$, e.g. in choosing

$a' = \hat{A}(o',i')$, (6.66)

but did not do so, since choosing $a'$ would illegally have changed the state $x$ to $x'$
(both at time $t$), and, equivalently (given determinism), would have changed $x_0$ to
$x'_0$. On our reading of Lewis’s theory of counterfactuals, Alice’s ability to choose $a'$
simply means that there exists a state $x'$ of the world close to $x$ in the sense that

$O(x') = O(x) = o$, (6.67)

making the environment in which Alice acts the same as in the actual world, but

$i' = I(x') \neq I(x) = i$, (6.68)

where $i'$ should be close to $i$ in some appropriate sense (such as a slight change in
the state of Alice’s brain), such that (6.66) holds, with $o' = o$ as required by (6.67).
The point, then, is that according to our Freedom assumption, there indeed is such
a nearby state x', for any given i' and (o, i). Thus the freedom Alice has is precisely 
what we have formalized as Freedom: even given the state o of the causal influences 
on her behaviour (and possibly even the entire state of the rest of the world), there is 
a different admissible state x' of the world such that, had this state been actual, she 
would have chosen a' (although she in fact, necessarily, picked a). 
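
The Freedom assumption and the counterfactual reading of Alice's ability to choose a' can be illustrated in a finite toy model. The sketch below is only an illustration under assumptions of our own: the sets `X_O`, `X_I`, the state space `X`, and the rule `A_hat` are hypothetical and carry no physical meaning.

```python
from itertools import product

# Hypothetical finite sets of outer and inner states (our choice)
X_O = ("calm", "noisy")
X_I = (0, 1)
# Toy state space of the world: here simply all pairs (outer, inner)
X = tuple(product(X_O, X_I))

def O(x):
    """Outer state of the world state x, cf. (6.56)."""
    return x[0]

def I(x):
    """Inner state of Alice in the world state x, cf. (6.57)."""
    return x[1]

def A_hat(o, i):
    """Hypothetical rule for Alice's setting as a function of (o, i), cf. (6.58)."""
    return "a" if i == 0 else "a'"

def A(x):
    """Alice's setting as a function of the world state, cf. (6.59)."""
    return A_hat(O(x), I(x))

def freedom_holds():
    """Freedom: the map x -> (O(x), I(x)) is surjective, cf. (6.65)."""
    return {(O(x), I(x)) for x in X} == set(product(X_O, X_I))

def counterfactual_states(x, a_prime):
    """States x' with the same outer state as x in which Alice chooses a_prime."""
    return [y for y in X if O(y) == O(x) and A(y) == a_prime]

x = ("calm", 0)
print(freedom_holds())
print(A(x))
print(counterfactual_states(x, "a'"))
```

In this toy model the actual state `x` fixes the setting `a`, yet Freedom guarantees a state with the same outer component in which the other setting is chosen.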

It should be clear now that at least in the context of the Free Will Theorem, our 
precise technical formulation of all assumptions implies that the freedom Alice and 
Bob have in choosing their settings is an instance of the local miracle compatibilist 
form of free will proposed by Lewis (1981), at least if one accepts our reformulation 
thereof. The theorem then establishes a contradiction between: 

• the physics assumptions, i.e., Nature and Locality;

• the compatibilist free will assumption, i.e., Determinism and Freedom.

Accepting the former, the latter must fall. Making this choice, one should realize that 
the physics assumptions on the one hand just form a small corner of modern physics 
(from which point of view they are weak), but on the other hand have singled out 
the corner in which the two fundamental theories of quantum mechanics and special 
relativity meet and are brought to a head (from which perspective they are strong). 

The challenge their theorem puts to compatibilism was recognized by Conway
& Kochen (2009), who write: 

‘The tension between human free will and physical determinism has a long history. Long 
ago, Lucretius made his otherwise deterministic particles swerve unpredictably to allow 
for free will. It was largely the great success of deterministic classical physics that led to 
the adoption of determinism by so many philosophers and scientists, particularly those in 
fields remote from current physics. (This remark also applies to “compatibilism”, a now 
unnecessary attempt to allow for human free will in a deterministic world.)’ 

This quotation does not use a precise version of compatibilism, but, as Conway 
explains elsewhere, what they mean is that compatibilism in whatever form was 
a desperate pre-twentieth-century attempt to save the notion of free will for e.g. 
Christianity in the face of the physics of the time, which assumed that the universe 
was a mechanical clockwork. Such attempts, then, would no longer be necessary 
if the world is, in fact, indeterministic (as Conway and Kochen claim to have at 
last proved). Our reformulation of their theorem (which removes the threat of circularity)
gives a more subtle picture: the FWT uses modern physics to challenge
one particular version of compatibilist free will. As such, it only provides indirect 
support for libertarian free will, namely by weakening one of its competitors. 

To close this philosophical intermezzo, let us note that determinism is seen as 
a property of theories. Since it is the job of a deterministic theory to predict the 
outcome of any experiment, whether or not it is performed, this obviates the need for 
assumptions like counterfactuality in the sense that ‘unperformed experiments have 
results’ (which was famously denied by Asher Peres). Such controversial notions of 
counterfactuality have effectively been replaced by the considerably more refined 
modal counterfactuality of Lewis (at least in our slight reformulation thereof). 
6.4 Technical intermezzo: The GHZ-Theorem

The essence of the proof of the Free Will Theorem lies in the argument that perfect
correlation together with context-locality implies non-contextuality. Remarkably,
context-locality is at the same time a special case of non-contextuality, as the
following example illustrates. We take $H = \mathbb{C}^2 \otimes \mathbb{C}^2$, equipped with the Bell basis

$\upsilon_0 = (|01\rangle - |10\rangle)/\sqrt{2}$; (6.69)

$\upsilon_1 = (|01\rangle + |10\rangle)/\sqrt{2}$; (6.70)

$\upsilon_2 = (|00\rangle - |11\rangle)/\sqrt{2}$; (6.71)

$\upsilon_3 = (|00\rangle + |11\rangle)/\sqrt{2}$, (6.72)

where we use the physicists’ notation

$|1\rangle = (1,0)$; (6.73)

$|0\rangle = (0,1)$; (6.74)

$|ij\rangle = |i\rangle \otimes |j\rangle$. (6.75)

Of course, $\mathbb{C}^2 \otimes \mathbb{C}^2$ contains the spin-1 Hilbert space $\mathbb{C}^3$ of the Kochen-Specker
Theorem as the subspace orthogonal to the vector $\upsilon_0$. Thus we identify $\mathbb{C}^3$
with the subspace of $\mathbb{C}^2 \otimes \mathbb{C}^2$ spanned by the basis vectors $\upsilon_1, \upsilon_2, \upsilon_3$. The operators

$J_u = \tfrac{1}{2}(\sigma_u \otimes 1_2 + 1_2 \otimes \sigma_u)$, (6.76)

where $u \in \mathbb{R}^3$ is a unit vector as before, and

$\sigma_u = \sum_{i=1}^{3} u_i \sigma_i$ (6.77)

in terms of the Pauli matrices $\sigma_i$, map $\upsilon_0$ to zero and leave its orthogonal complement
stable. Elementary group theory or direct calculation then shows that the
operator $J_u$ on $\mathbb{C}^3$ in (6.11) is (unitarily) equivalent to the operator $J_u$ in (6.76), restricted to $\mathbb{C}^3$. Since

$J_u^2 = \tfrac{1}{2}(\sigma_u \otimes \sigma_u + 1_2 \otimes 1_2)$, (6.78)

the Kochen-Specker argument can be rephrased in terms of the operators $\sigma_u \otimes \sigma_u$.
In particular, for each frame $a = (u_1,u_2,u_3)$, the three operators

$(\sigma_{u_1} \otimes \sigma_{u_1}, \sigma_{u_2} \otimes \sigma_{u_2}, \sigma_{u_3} \otimes \sigma_{u_3})$ (6.79)

commute, they each square to one, and their joint eigenvalues are one of the triples:

$(-1,-1,-1), (-1,1,1), (1,-1,1), (1,1,-1)$.
The eigenvector corresponding to the first one is $\upsilon_0$, and hence the others must lie
in $\mathbb{C}^3$. Hence by Lemma 6.4 any quasi-linear non-contextual hidden variable must
also assign these values, which by Lemma 6.7 is impossible for arbitrary bases.

The key mathematical property of the three operators (6.79) is that they commute,
and together with the unit $1_2 \otimes 1_2$ form a maximal set of commuting self-adjoint
matrices on $\mathbb{C}^4$. But other such sets could have been chosen by Alice (under whose
sole control the situation so far has been assumed to be), such as a triple of the kind

$(\sigma_u \otimes 1_2, 1_2 \otimes \sigma_v, \sigma_u \otimes \sigma_v)$,

where $u$ and $v$ are arbitrary unit vectors in $\mathbb{R}^3$. Since the third operator is the product
of the first two, the joint eigenvalues of this triple, and hence also the assignments
by a quasi-linear non-contextual hidden variable, must be one of the four triples

$(1,1,1), (-1,1,-1), (1,-1,-1), (-1,-1,1)$.

The non-contextuality assumption would then dictate that the outcome of Alice’s
measurement of $\sigma_u \otimes 1_2$ be independent of her choice of the setting $v$ in a possible
simultaneous measurement of $1_2 \otimes \sigma_v$, and vice versa. Therefore, in a (non-local)
bipartite setting where Alice is only able to measure operators of the type $a \otimes 1_2$,
whilst Bob can measure $1_2 \otimes b$, on the above choice of (commuting) operators,
non-contextuality in the situation where Alice controls everything is mathematically
equivalent to (context) locality in the bipartite Alice & Bob setting.

Further constraints then arise if the system is prepared in a correlated state like
$\upsilon_0$, which is an eigenstate of $\sigma_u \otimes \sigma_v$ with eigenvalue $-1$ whenever $u = v$. So in that
case the values of $(\sigma_u \otimes 1_2, 1_2 \otimes \sigma_v)$ can only be $(1,-1)$ or $(-1,1)$, yielding perfect
anti-correlation. This is not enough, however, to derive a Free Will Theorem; to do
so with the small single-site Hilbert space $\mathbb{C}^2$, one needs a third (non-local) party.
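
The algebraic facts used in this section, namely the identity (6.78), the vanishing of $J_u$ on the singlet vector of (6.69), and the eigenvalue $-1$ of $\sigma_u \otimes \sigma_u$ on that vector, can be confirmed by a direct calculation. The following pure-Python sketch is an illustration only; the helper names and the sample unit vector are our own choices.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3 and the 2x2 unit matrix
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ONE = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def half_sum(a, b):
    """Return (a + b) / 2 entrywise."""
    n = len(a)
    return [[0.5 * (a[i][j] + b[i][j]) for j in range(n)] for i in range(n)]

def apply(m, vec):
    return [sum(m[i][j] * vec[j] for j in range(len(vec))) for i in range(len(m))]

def sigma(u):
    """sigma_u = sum_i u_i sigma_i, cf. (6.77)."""
    return [[sum(u[k] * PAULI[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def J(u):
    """J_u = (sigma_u (x) 1 + 1 (x) sigma_u) / 2, cf. (6.76)."""
    s = sigma(u)
    return half_sum(kron(s, ONE), kron(ONE, s))

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

# |1> = (1,0), |0> = (0,1); with the index convention of kron above,
# |01> has its single entry at index 2 and |10> at index 1.
r = 1 / math.sqrt(2)
singlet = [0, -r, r, 0]              # (|01> - |10>)/sqrt(2), cf. (6.69)

u = (0.6, 0.0, 0.8)                  # an arbitrary unit vector (our choice)
s = sigma(u)
lhs = matmul(J(u), J(u))
rhs = half_sum(kron(s, s), kron(ONE, ONE))
print(max_abs_diff(lhs, rhs))        # deviation from (6.78)
print(apply(J(u), singlet))          # J_u applied to the singlet vector
print(apply(kron(s, s), singlet))    # sigma_u (x) sigma_u applied to the singlet
```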

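Indeed, the well-known tripartite GHZ-argument may be rephrased as a Free Will
Theorem, as follows. The underlying Hilbert space is

$H = \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^8$, (6.80)

and hence as a warm-up we first (re)prove Theorem 6.5 for $n = 8$. Suppose we have
a map $V : H_8(\mathbb{C}) \to \mathbb{R}$ as in Definition 6.1. Write

$\lambda_a^{(1)} = V(\sigma_a \otimes 1_2 \otimes 1_2)$, $\lambda_b^{(2)} = V(1_2 \otimes \sigma_b \otimes 1_2)$, $\lambda_c^{(3)} = V(1_2 \otimes 1_2 \otimes \sigma_c)$,

where $a,b,c$ can be 1,2,3. From Lemma 6.4 we then have

$V(\sigma_1 \otimes \sigma_2 \otimes \sigma_2) = \lambda_1^{(1)} \lambda_2^{(2)} \lambda_2^{(3)}$; (6.81)

$V(\sigma_2 \otimes \sigma_1 \otimes \sigma_2) = \lambda_2^{(1)} \lambda_1^{(2)} \lambda_2^{(3)}$; (6.82)

$V(\sigma_2 \otimes \sigma_2 \otimes \sigma_1) = \lambda_2^{(1)} \lambda_2^{(2)} \lambda_1^{(3)}$; (6.83)

$V(\sigma_1 \otimes \sigma_1 \otimes \sigma_1) = \lambda_1^{(1)} \lambda_1^{(2)} \lambda_1^{(3)}$. (6.84)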
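Furthermore, the four operators on the left-hand side commute and turn out to satisfy

$(\sigma_1 \otimes \sigma_2 \otimes \sigma_2) \cdot (\sigma_2 \otimes \sigma_1 \otimes \sigma_2) \cdot (\sigma_2 \otimes \sigma_2 \otimes \sigma_1) = -\sigma_1 \otimes \sigma_1 \otimes \sigma_1$, (6.85)

so that again by Lemma 6.4,

$\lambda_1^{(1)} \lambda_2^{(2)} \lambda_2^{(3)} \cdot \lambda_2^{(1)} \lambda_1^{(2)} \lambda_2^{(3)} \cdot \lambda_2^{(1)} \lambda_2^{(2)} \lambda_1^{(3)} = -\lambda_1^{(1)} \lambda_1^{(2)} \lambda_1^{(3)}$, (6.86)

i.e. $(\lambda_2^{(1)} \lambda_2^{(2)} \lambda_2^{(3)})^2 = -1$. Since $\lambda_j^{(i)} = \pm 1$, this is impossible, so that $V$ cannot exist.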

Now, using the notation in the preceding discussion, consider the unit vector

$\psi_{GHZ} = (|111\rangle - |000\rangle)/\sqrt{2}$, (6.87)

which is a joint eigenstate of each of the four operators on the left-hand side of
(6.81) - (6.84), with eigenvalue $+1$ for the first three, and hence eigenvalue $-1$
for the fourth, i.e., $\sigma_1 \otimes \sigma_1 \otimes \sigma_1$. So if setting $A = a$ for Alice (where $a \in \{1,2\}$)
means that she measures $F = \sigma_a \otimes 1_2 \otimes 1_2$ with outcome $\lambda_a^{(1)} = \pm 1$, and similarly
$B = b$ for Bob and $C = c$ for Cindy mean that they measure $G = 1_2 \otimes \sigma_b \otimes 1_2$ and
$H = 1_2 \otimes 1_2 \otimes \sigma_c$ with outcomes $\lambda_b^{(2)} = \pm 1$ and $\lambda_c^{(3)} = \pm 1$, respectively, then in the
state $\psi_{GHZ}$ each of the settings gives the correlation

settings $(a,b,c) = (1,2,2), (2,1,2), (2,2,1)$ $\Rightarrow$ $\lambda_a^{(1)} \lambda_b^{(2)} \lambda_c^{(3)} = 1$; (6.88)

setting $(a,b,c) = (1,1,1)$ $\Rightarrow$ $\lambda_1^{(1)} \lambda_1^{(2)} \lambda_1^{(3)} = -1$. (6.89)
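
The eigenvalue claims behind (6.88) and (6.89) can be verified by applying the four product operators to the GHZ vector. The sketch below does this in plain Python; it is an illustration of ours, with helper names of our own choosing, using the convention $|1\rangle = (1,0)$, $|0\rangle = (0,1)$ of (6.73) - (6.74).

```python
import math

PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]

def apply(m, vec):
    return [sum(m[i][j] * vec[j] for j in range(len(vec))) for i in range(len(m))]

# With |1> = (1,0) and |0> = (0,1), the vector |111> is the first and |000>
# the last standard basis vector of C^8.
r = 1 / math.sqrt(2)
psi_ghz = [r, 0, 0, 0, 0, 0, 0, -r]      # (|111> - |000>)/sqrt(2), cf. (6.87)

def ghz_eigenvalue(a, b, c):
    """Eigenvalue of sigma_a (x) sigma_b (x) sigma_c on psi_GHZ.

    Raises ValueError if psi_GHZ is not an eigenvector of that operator.
    """
    op = kron(kron(PAULI[a], PAULI[b]), PAULI[c])
    out = apply(op, psi_ghz)
    lam = out[0] / psi_ghz[0]
    if any(abs(out[k] - lam * psi_ghz[k]) > 1e-12 for k in range(8)):
        raise ValueError("psi_GHZ is not an eigenvector of this operator")
    return round(complex(lam).real)

for setting in [(1, 2, 2), (2, 1, 2), (2, 2, 1), (1, 1, 1)]:
    print(setting, ghz_eigenvalue(*setting))
```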

Theorem 6.14. The conjunction of the following assumptions is contradictory:

• Determinism: there is a state space $X$ with associated functions

$A,B,C : X \to \{1,2\}$, $F,G,H : X \to \Lambda$,

which completely describe the experiment, in that $x \in X$ determines both settings
$(a,b,c)$ and outcomes $(\lambda_1,\lambda_2,\lambda_3) \in \Lambda^3$ through $a = A(x)$, $\lambda_1 = F(x)$, etc.

• Nature: the experiment (performed in the state $\psi_{GHZ}$) has possible outcomes in
$\Lambda = \{-1,1\}$, subject to the correlations (6.88) - (6.89);

• Freedom: there is a further function $Z : X \to X_Z$, in terms of which

$F = F(A,B,C,Z)$, $G = G(A,B,C,Z)$, $H = H(A,B,C,Z)$,

and $A$, $B$, $C$, $Z$ are independent, i.e. for each $(a,b,c,z)$ there is $x \in X$ such that
$A(x) = a$, $B(x) = b$, $C(x) = c$, $Z(x) = z$.

• Locality: $F = F(A,Z)$, $G = G(B,Z)$, and $H = H(C,Z)$.

Proof. Using notation as in the proof of Theorem 6.13, for fixed $z \in X_Z$ we obtain
$\hat{F}(a,z) = \lambda_a^{(1)}$, etc. Nature then leads to the contradiction derived after (6.86). □
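
The contradiction at the end of the proof is a finite statement: for fixed $z$, Locality leaves six numbers $\lambda_a^{(k)} = \pm 1$ (three parties, two settings each), and no choice of them satisfies (6.88) and (6.89) simultaneously. The following Python sketch confirms this by exhaustive search; the function name and the dictionary encoding are ours.

```python
from itertools import product

def consistent_assignments():
    """All assignments lam[(party, setting)] in {-1, 1} satisfying (6.88) and (6.89)."""
    keys = [(party, setting) for party in (1, 2, 3) for setting in (1, 2)]
    found = []
    for values in product((-1, 1), repeat=len(keys)):
        lam = dict(zip(keys, values))
        # (6.88): product equals +1 for the settings (1,2,2), (2,1,2), (2,2,1)
        plus = all(lam[(1, a)] * lam[(2, b)] * lam[(3, c)] == 1
                   for (a, b, c) in ((1, 2, 2), (2, 1, 2), (2, 2, 1)))
        # (6.89): product equals -1 for the setting (1,1,1)
        minus = lam[(1, 1)] * lam[(2, 1)] * lam[(3, 1)] == -1
        if plus and minus:
            found.append(lam)
    return found

print(len(consistent_assignments()))
```

The search covers all 64 assignments, so an empty result is a brute-force version of the argument following (6.86).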


6.5 Bell’s theorems


Two different results are known as “Bell’s Theorem”: the first, from his paper in
1964, is Theorem 6.15 below, and the second, dating from 1976, is Theorem 6.18.
The first is similar to the Free Will Theorem in both its assumptions and its conclusion,
and to make this similarity more obvious we first state it for $\mathbb{C}^3$ instead of $\mathbb{C}^2$.
The difference lies in the probabilistic flavour of Bell’s Theorem, whose empirical
input is not given by the only non-probabilistic consequence to be drawn from the
quantum-mechanical formulae (6.35) - (6.38), viz. the certainty (6.43) of perfect
correlation on identical settings, but rather by the probabilistic formula (6.42), i.e.,

$P_{\psi_0}(F_i \neq G_j \mid A_i = u_i, B_j = v_j) = \tfrac{2}{3}\sin^2\theta_{u_i,v_j}$ $(i,j = 1,2,3)$, (6.90)

where $\theta_{u,v}$ is the angle between two unit vectors $u$ and $v$. Furthermore, the state
space $X$ must be upgraded to a probability space $(X,\Sigma,\mu)$, carrying functions $A$
and $B$ (for the settings, which unlike Bell himself, who treated them as labels,
we include among the random variables), $F$ and $G$ (for the outcomes) and finally $Z$
(for the hidden variable traditionally called $\lambda$) as random variables, i.e., measurable
functions. This also implies that the target spaces $X_A$ to $X_Z$ (which is traditionally
called $\Lambda$) must be equipped with some $\sigma$-algebra of measurable subsets. But this is
not a big deal, since $X_A = X_B$ carries a natural Borel structure and $X_F = X_G$ is finite.
The probability measure $\mu$ is assumed independent of $(A,B,F,G)$, and vice versa.

The measure $\mu$, which gives the “hidden state” of the system that allegedly underlies
its quantum-mechanical description, is chosen in such a way that empirical
probabilities (typically obtained from long runs of repeated measurements) are recovered
as joint conditional probabilities defined by $\mu$ and the random variables,
i.e., assuming the settings $(a,b)$ are possible in that $P(A = a, B = b) > 0$, we put

$P(F = \lambda, G = \gamma \mid A = a, B = b) = \dfrac{P(F = \lambda, G = \gamma, A = a, B = b)}{P(A = a, B = b)}$, (6.91)

where the joint probabilities on the right-hand side are given by

$P(A = a, B = b) = \mu(A = a, B = b)$; (6.92)

$P(F = \lambda, G = \gamma, A = a, B = b) = \mu(F = \lambda, G = \gamma, A = a, B = b)$, (6.93)

where $\mu(A = a, B = b)$ is shorthand for $\mu(\{x \in X \mid A(x) = a, B(x) = b\})$, etc. This
implies that $\mu$ depends on (but may not be determined by) the quantum state $\psi_0$.

On this understanding, the assumptions of Determinism and Locality are the
same as for the Free Will Theorem (except that equations like $F(x) = \hat{F}(A(x),Z(x))$
are merely supposed to hold almost everywhere with respect to $\mu$). Freedom is
now taken to mean that $(A,B,Z)$ are probabilistically independent relative to $\mu$. By
definition, this also means that the pairs $(A,B)$, $(A,Z)$, and $(B,Z)$ are independent,
so that for any $\tilde{A} \subset X_A$, $\tilde{B} \subset X_B$, and (measurable) $\tilde{Z} \subset X_Z$, defining

$P(A \in \tilde{A}, B \in \tilde{B}, Z \in \tilde{Z}) = \mu(\{x \in X \mid A(x) \in \tilde{A}, B(x) \in \tilde{B}, Z(x) \in \tilde{Z}\})$, (6.94)

and analogous expressions for $P(A \in \tilde{A})$ and $P(A \in \tilde{A}, B \in \tilde{B})$, etc., we have

$P(A \in \tilde{A}, B \in \tilde{B}) = P(A \in \tilde{A}) P(B \in \tilde{B})$; (6.95)

$P(A \in \tilde{A}, Z \in \tilde{Z}) = P(A \in \tilde{A}) P(Z \in \tilde{Z})$; (6.96)

$P(B \in \tilde{B}, Z \in \tilde{Z}) = P(B \in \tilde{B}) P(Z \in \tilde{Z})$; (6.97)

$P(A \in \tilde{A}, B \in \tilde{B}, Z \in \tilde{Z}) = P(A \in \tilde{A}) P(B \in \tilde{B}) P(Z \in \tilde{Z})$. (6.98)

If we finally define Nature as the claim that $F$ and $G$ are 2-valued and that

$P(F_i \neq G_j \mid A_i = u_i, B_j = v_j) = \tfrac{2}{3}\sin^2\theta_{u_i,v_j}$ $(i,j = 1,2,3)$, (6.99)


where the left-hand side is the conditional probability defined by $\mu$ and the random
variables in question (whereas the left-hand side of (6.90) is the empirical probability
for the experiment in question, or, equivalently, the quantum-mechanical prediction
thereof), then we obtain the following spin-1 version of Bell’s first theorem:

Theorem 6.15. Determinism, Freedom, Nature, and Locality are contradictory. 

This formulation is literally the same as Theorem 6.13, but the terms have acquired a 
different technical meaning now, especially Freedom and Nature. Moreover, purists 
would add Probability Theory as an assumption in Bell’s Theorem, as its formalism 
is decidedly non-tautological and its interpretation is far from obvious, even in a 
classical setting. In any case, the proof is practically the same as in the more familiar 
optical version of the EPR-experiment, to which we now turn. 

In the classical (sic) form of the experiment, Alice and Bob perform measurements
on incoming photons by letting them pass through a polaroid glass whose
axis of polarization makes angle $a$ (Alice) or $b$ (Bob) with (say) the horizontal axis
in the plane orthogonal to the direction of propagation of the photons. Considered
in the light of the previous experiment on spin-1 particles, such a choice of settings
may also be seen as a choice of basis for $\mathbb{R}^3$, with the proviso that, assuming (by
convention) the photons move along the y-axis, one basis element $u_2 = (0,1,0)$ is
fixed so that the remaining two vectors $(u_1,u_3)$ must lie in the x-z plane (in which,
on a naive picture, the photons may “vibrate”). This constraint gives rise to bases

$u_1 = (\cos a, 0, \sin a)$, $u_2 = (0,1,0)$, $u_3 = (-\sin a, 0, \cos a)$, (6.100)

the first of which (say) gives the actual direction of the axis of polarization. In any 
case, Alice writes down F = 1 if her photon passes her glass at angle a, and F = 0 
if it does not; similarly Bob writes G—1 (pass) or G = 0 (fail) at setting b. 

In a quantum-mechanical description of the experiment, the Hilbert space of the
photon pair is $\mathbb{C}^2 \otimes \mathbb{C}^2$, and the correlated photon state is taken to be

\[ \psi_0 = (e_1 \otimes e_1 + e_2 \otimes e_2)/\sqrt{2}, \tag{6.101} \]

where $e_1 = (1,0)$ and $e_2 = (0,1)$ form the standard basis of $\mathbb{C}^2$. The probabilities
(6.35) - (6.38) as predicted by quantum mechanics are now replaced by

\[ P_{\psi_0}(F = 1, G = 1 \mid A = a, B = b) = \tfrac{1}{2} \cos^2(a - b); \tag{6.102} \]

\[ P_{\psi_0}(F = 0, G = 0 \mid A = a, B = b) = \tfrac{1}{2} \cos^2(a - b); \tag{6.103} \]

\[ P_{\psi_0}(F = 1, G = 0 \mid A = a, B = b) = \tfrac{1}{2} \sin^2(a - b); \tag{6.104} \]

\[ P_{\psi_0}(F = 0, G = 1 \mid A = a, B = b) = \tfrac{1}{2} \sin^2(a - b), \tag{6.105} \]

which are also the experimentally measured ones. Instead of (6.90) we then obtain

\[ P_{\psi_0}(F \neq G \mid A = a, B = b) = \sin^2(a - b); \tag{6.106} \]

\[ P_{\psi_0}(F = G \mid A = a, B = b) = \cos^2(a - b). \tag{6.107} \]


In particular, if their settings are the same (i.e., $a = b$), then Alice and Bob will
always find the same outcome (perfect correlation), whereas in case they are orthogonal
(i.e., $a = b \pm \pi/2$), they obtain perfect anti-correlation, in that Alice’s
photon passes whenever Bob’s is blocked, and vice versa. However, this will not be
used. Although it should be obvious from the previous case what the assumptions
in Theorem 6.15 mean for this particular experiment, we make them explicit:

• Determinism means that there is a probability space $(X,\Sigma,\mu)$ with associated
(measurable) functions

\[ A : X \to [0,\pi], \quad B : X \to [0,\pi], \quad F : X \to \{0,1\}, \quad G : X \to \{0,1\}, \tag{6.108} \]

which completely describe the experiment in the sense that $x \in X$ determines
both its settings $a = A(x)$, $b = B(x)$ and its outcomes $\lambda = F(x)$, $\gamma = G(x)$.

• Freedom stipulates that there is a (measurable) function $Z : X \to X_Z$ such that:

- $F = F(A,B,Z)$ and $G = G(A,B,Z)$;

- $(A,B,Z)$ are probabilistically independent relative to $\mu$.

• Locality means that $F(A,B,Z) = F(A,Z)$ and $G(A,B,Z) = G(B,Z)$.

• Nature states that the empirical as well as theoretical probabilities (6.106) for the
experiment are reproduced as conditional joint probabilities given by $\mu$ through

\[ P(F \neq G \mid A = a, B = b) = \sin^2(a - b). \tag{6.109} \]

Theorem 6.15 then holds verbatim for this situation, with the following proof. 

Proof. Determinism and Freedom imply

\[ P(F = \lambda, G = \gamma \mid A = a, B = b) = P_{ABZ}(F = \lambda, G = \gamma \mid A = a, B = b), \tag{6.110} \]

where we use the notation (6.50) - (6.51), the function $A : X_A \times X_B \times X_Z \to X_A$ is
projection on the first coordinate, likewise the function $B : X_A \times X_B \times X_Z \to X_B$ is
projection on the second, and $P_{ABZ}$ is the joint probability on $X_A \times X_B \times X_Z$ induced
by the triple $(A,B,Z)$ and the probability measure $\mu$; by independence, $P_{ABZ}$ is a
product measure on $X_A \times X_B \times X_Z$. According to Locality, $F(a,b,z)$ does not depend
on $b$, whilst $G(a,b,z)$ does not depend on $a$.
For fixed settings $(a,b)$, we may therefore define the following functions on $X_Z$:

\[ F_a(z) = F(a,z); \tag{6.111} \]

\[ G_b(z) = G(b,z). \tag{6.112} \]

A brief computation then yields

\[ P_{ABZ}(F = \lambda, G = \gamma \mid A = a, B = b) = P_Z(F_a = \lambda, G_b = \gamma), \tag{6.113} \]

where $P_Z$ is the joint probability on $X_Z$ defined by $Z$ and $\mu$. Therefore, from (6.110),

\[ P(F = \lambda, G = \gamma \mid A = a, B = b) = P_Z(F_a = \lambda, G_b = \gamma). \tag{6.114} \]

Nature then gives the crucial result

\[ P_Z(F_a \neq G_b) = \sin^2(a - b). \tag{6.115} \]

Lemma 6.16. Any four $\{0,1\}$-valued random variables $(F_1,F_2,G_1,G_2)$ satisfy

\[ P(F_1 \neq G_1) \leq P(F_1 \neq G_2) + P(F_2 \neq G_1) + P(F_2 \neq G_2). \tag{6.116} \]

This lemma (said to go back to Boole) is very easy to prove directly, but for
completeness’s sake we mention that it also follows from Proposition 6.17 below.

Taking $F_1 = F_{a_1}$, $F_2 = F_{a_2}$, $G_1 = G_{b_1}$, $G_2 = G_{b_2}$, and $P = P_Z$, for suitable values of
$(a_1,a_2,b_1,b_2)$ this inequality is violated by (6.115). Take, for example, $a_2 = b_2 = 3x$,
$a_1 = 0$, and $b_1 = x$. The inequality (6.116) then assumes the form $f(x) \geq 0$ for

\[ f(x) = \sin^2(3x) + \sin^2(2x) - \sin^2(x). \]

Figure: Graph of $x \mapsto \sin^2(3x) + \sin^2(2x) - \sin^2(x)$, showing (in the region where it is
negative) that quantum mechanics violates the Bell inequality (6.116).
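The direct proof of Lemma 6.16 and the violation just described can be checked numerically. The following is only a sketch: the function names are hypothetical, and the pointwise check relies on the observation (not spelled out in the text) that (6.116) follows by taking expectations of the corresponding inequality between indicator functions.

```python
import itertools
import math


def boole_holds_pointwise():
    """Check the indicator version of (6.116) for all 16 assignments of {0,1} values."""
    for f1, f2, g1, g2 in itertools.product((0, 1), repeat=4):
        lhs = int(f1 != g1)
        rhs = int(f1 != g2) + int(f2 != g1) + int(f2 != g2)
        if lhs > rhs:
            return False
    return True


def f(x):
    """f(x) = sin^2(3x) + sin^2(2x) - sin^2(x), obtained from (6.115) and (6.116)
    with the settings a_1 = 0, b_1 = x, a_2 = b_2 = 3x."""
    return math.sin(3 * x) ** 2 + math.sin(2 * x) ** 2 - math.sin(x) ** 2


def violation_points(steps=1000):
    """Grid points in [0, pi] at which f is negative, i.e. where (6.116) fails."""
    grid = [math.pi * k / steps for k in range(steps + 1)]
    return [x for x in grid if f(x) < 0]


if __name__ == "__main__":
    print("pointwise inequality holds:", boole_holds_pointwise())
    xs = violation_points()
    print("f < 0 on grid points between", min(xs), "and", max(xs))
    print("f(3*pi/8) =", f(3 * math.pi / 8))
```

For instance, at $x = 3\pi/8$ one finds $f(x) = (1 - \sqrt{2})/2 < 0$, whereas at $x = \pi/8$ the function is positive.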
Lemma 6.16 is a special case of a more general result.

Proposition 6.17. Let $F_i : X \to [-1,1]$ and $G_j : X \to [-1,1]$, where $(X,\Sigma,\mu)$ is
some probability space, be two parametrized random variables, $i,j = 1,2$. Then the
two-point function $\langle F_i G_j \rangle = \int_X d\mu\, F_i G_j$ satisfies the CHSH-inequality

\[ |\langle F_1 G_1 \rangle + \langle F_1 G_2 \rangle + \langle F_2 G_1 \rangle - \langle F_2 G_2 \rangle| \leq 2. \tag{6.117} \]

If $F_i$ and $G_j$ just take the values $\pm 1$, then (6.116) is a special case of (6.117).

Proof. In terms of the function $\Phi = F_1 \cdot (G_1 + G_2) + F_2 \cdot (G_1 - G_2)$, we may write

\[ \langle F_1 G_1 \rangle + \langle F_1 G_2 \rangle + \langle F_2 G_1 \rangle - \langle F_2 G_2 \rangle = \int_X d\mu\, \Phi. \tag{6.118} \]

Since $|F_i(x)| \leq 1$ and $|G_j(x)| \leq 1$ by assumption, we have $|\Phi(x)| \leq 2$ and hence

\[ \left| \int_X d\mu(x)\, \Phi(x) \right| \leq \int_X d\mu(x)\, |\Phi(x)| \leq 2, \tag{6.119} \]

since $\mu$ is a probability measure. To prove the last claim, we just note that

\[ P(F_i = G_j) - P(F_i \neq G_j) = \langle F_i G_j \rangle; \]

\[ P(F_i = G_j) + P(F_i \neq G_j) = 1. \quad \Box \]
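The bound $|\Phi(x)| \leq 2$ used in the proof, and the way the photon correlations exceed it, may be illustrated as follows. This is a sketch under stated assumptions: the outcomes 0 and 1 are relabelled as $-1$ and $+1$, so that by the last two identities of the proof and (6.106) - (6.107) the two-point function is $\cos^2(a-b) - \sin^2(a-b)$; the function names and the particular angles are illustrative choices, not taken from the text.

```python
import itertools
import math
import random


def phi(f1, f2, g1, g2):
    """The integrand Phi = F1*(G1 + G2) + F2*(G1 - G2) of (6.118) at a single point x."""
    return f1 * (g1 + g2) + f2 * (g1 - g2)


def max_abs_phi_deterministic():
    """Largest |Phi| over all sixteen +/-1-valued assignments."""
    return max(abs(phi(*v)) for v in itertools.product((-1, 1), repeat=4))


def max_abs_phi_sampled(samples=100_000, seed=0):
    """Largest |Phi| found among random values in [-1, 1] (never exceeds 2)."""
    rng = random.Random(seed)
    return max(
        abs(phi(*(rng.uniform(-1, 1) for _ in range(4)))) for _ in range(samples)
    )


def quantum_chsh(a1, a2, b1, b2):
    """Left-hand side of (6.117) without the absolute value, for the photon
    correlations <FG> = cos^2(a - b) - sin^2(a - b) (outcomes relabelled +/-1)."""
    def corr(a, b):
        return math.cos(a - b) ** 2 - math.sin(a - b) ** 2
    return corr(a1, b1) + corr(a1, b2) + corr(a2, b1) - corr(a2, b2)


if __name__ == "__main__":
    print(max_abs_phi_deterministic())
    print(max_abs_phi_sampled(10_000))
    print(quantum_chsh(0.0, math.pi / 4, math.pi / 8, -math.pi / 8))
```

With the angles $a_1 = 0$, $a_2 = \pi/4$, $b_1 = \pi/8$, $b_2 = -\pi/8$ the quantum value is $2\sqrt{2}$, which exceeds the bound in (6.117).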

In Bell’s second (1976) theorem on stochastic hidden variables, the assumption
of Determinism is dropped, and all we have is a theory stating conditional
probabilities $P(F = \lambda, G = \gamma \mid A = a, B = b, x)$ for the outcomes of the above bipartite
experiment given some hidden variable $x$, as well as the single-wing versions
$P(F = \lambda \mid A = a, x)$ and $P(G = \gamma \mid B = b, x)$. Here $F,G,A,B$ are just notational devices
to record such outcomes, which are no longer (necessarily) represented as random
variables. On this new understanding of the notation, the Nature assumption is formulated
just as before, cf. (6.109). We do assume the existence of a probability
space $(X,\Sigma,\mu)$ and of conditional probabilities

\[ P(F = \lambda, G = \gamma \mid A = a, B = b, x), \quad P(F = \lambda \mid A = a, x), \quad P(G = \gamma \mid B = b, x), \]

defined $\mu$-a.e. in $x$, in which the state of the world is specified as being $x \in X$. In
terms of this space, the Freedom assumption means that

\[ P(F = \lambda, G = \gamma \mid A = a, B = b) = \int_X d\mu(x)\, P(F = \lambda, G = \gamma \mid A = a, B = b, x), \tag{6.120} \]

for any settings $(a,b)$, of which $\mu$ is independent (as the notation already indicated).
The crucial assumption replacing Determinism is Bell locality, which reads

\[ P(F = \lambda, G = \gamma \mid A = a, B = b, x) = P(F = \lambda \mid A = a, x) \cdot P(G = \gamma \mid B = b, x). \tag{6.121} \]

Bell’s second theorem for stochastic hidden variable theories reads as follows.

Theorem 6.18. Nature, Freedom, and Bell locality are contradictory.

Proof. The idea of the proof is to introduce an artificial probability space in order
to recover the framework of Theorem 6.15. To this end, we take

\[ \tilde{X} = [0,1] \times [0,1] \times X; \tag{6.122} \]

\[ d\tilde{\mu}(s,t,x) = ds \cdot dt \cdot d\mu(x), \tag{6.123} \]

where we denoted the elements of $\tilde{X}$ by $(s,t,x)$. On $\tilde{X}$, define random variables

\[ \tilde{F}_a(s,t,x) = 1_{[0, P(F=1 \mid A=a, x)]}(s); \tag{6.124} \]

\[ \tilde{G}_b(s,t,x) = 1_{[0, P(G=1 \mid B=b, x)]}(t), \tag{6.125} \]

where $1_A$ is the indicator function for $A \subset [0,1]$. Writing, as usual,

\[ \tilde{P}(\tilde{F}_a = \lambda, \tilde{G}_b = \gamma) = \tilde{\mu}(\{(s,t,x) \in \tilde{X} \mid \tilde{F}_a(s,t,x) = \lambda, \tilde{G}_b(s,t,x) = \gamma\}), \]

we obtain (first for $\lambda = \gamma = 1$, from which the other cases follow):

\[ \tilde{P}(\tilde{F}_a = \lambda, \tilde{G}_b = \gamma) = \int_X d\mu(x)\, P(F = \lambda \mid A = a, x) \cdot P(G = \gamma \mid B = b, x). \tag{6.126} \]

With Freedom and Bell locality, this yields

\[ P(F = \lambda, G = \gamma \mid A = a, B = b) = \tilde{P}(\tilde{F}_a = \lambda, \tilde{G}_b = \gamma), \tag{6.127} \]

as in (6.114), so that the proof may be completed as for Theorem 6.15. □
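The construction in this proof can be illustrated by a small simulation. The sketch below assumes a hypothetical three-point hidden-variable space with made-up values for $\mu$ and for the single-wing probabilities at one fixed pair of settings; none of these numbers comes from the text. It samples $(s,t,x)$ from the artificial space and compares the frequency of the event that both indicator variables equal 1 with the right-hand side of (6.126).

```python
import random

# Hypothetical toy data (assumptions for illustration only):
MU = {"x1": 0.2, "x2": 0.5, "x3": 0.3}      # probability measure mu on X
P_F1 = {"x1": 0.9, "x2": 0.4, "x3": 0.1}    # P(F = 1 | A = a, x) for one fixed a
P_G1 = {"x1": 0.3, "x2": 0.8, "x3": 0.5}    # P(G = 1 | B = b, x) for one fixed b


def exact_joint():
    """Right-hand side of (6.126) for lambda = gamma = 1 (an integral over finite X)."""
    return sum(MU[x] * P_F1[x] * P_G1[x] for x in MU)


def sampled_joint(samples=100_000, seed=0):
    """Monte Carlo estimate of the probability that both indicators equal 1
    on the artificial space [0,1] x [0,1] x X with measure ds dt dmu(x)."""
    rng = random.Random(seed)
    states = list(MU)
    weights = [MU[x] for x in states]
    hits = 0
    for _ in range(samples):
        x = rng.choices(states, weights)[0]
        s, t = rng.random(), rng.random()
        f = 1 if s <= P_F1[x] else 0   # indicator of [0, P(F=1|a,x)] evaluated at s
        g = 1 if t <= P_G1[x] else 0   # indicator of [0, P(G=1|b,x)] evaluated at t
        hits += f * g
    return hits / samples


if __name__ == "__main__":
    print("exact:  ", exact_joint())
    print("sampled:", sampled_joint())
```

The agreement reflects the fact that, for fixed $x$, the rectangle $[0,p] \times [0,q]$ has Lebesgue measure $pq$, which is exactly the product form required by Bell locality.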

Let us note that since in Bell’s second theorem the settings $(a,b)$ are treated as free
parameters to begin with, the difference between $X$ and $Z$ evaporates, so that in the
above formulae one might as well have replaced $(X,\mu)$ by the space $(X_Z,P_Z)$ that
describes all relevant degrees of freedom except the settings (i.e., the experimentalist,
in either human or machine form). Either way, Bell’s locality condition may be
disentangled into the following conditions (introduced by Jarrett and Shimony):

1. Parameter Independence (PI):

\[ P(\lambda \mid a,b,x) = P(\lambda \mid a,x); \tag{6.128} \]

\[ P(\gamma \mid a,b,x) = P(\gamma \mid b,x); \tag{6.129} \]

2. Outcome Independence (OI):

\[ P(\lambda \mid a,b,\gamma,x) = P(\lambda \mid a,b,x); \tag{6.130} \]

\[ P(\gamma \mid a,b,\lambda,x) = P(\gamma \mid a,b,x), \tag{6.131} \]

where we have abbreviated $P(F = \lambda \mid A = a, B = b, x)$ by $P(\lambda \mid a,b,x)$, etc., and have
used the following notation (which states identities in case one has (6.91) - (6.93)):
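\[ P(\lambda \mid a,b,x) = \sum_{\gamma} P(\lambda,\gamma \mid a,b,x); \tag{6.132} \]

\[ P(\gamma \mid a,b,x) = \sum_{\lambda} P(\lambda,\gamma \mid a,b,x); \tag{6.133} \]

\[ P(\lambda \mid a,b,\gamma,x) = \frac{P(\lambda,\gamma \mid a,b,x)}{P(\gamma \mid a,b,x)}; \tag{6.134} \]

\[ P(\gamma \mid a,b,\lambda,x) = \frac{P(\lambda,\gamma \mid a,b,x)}{P(\lambda \mid a,b,x)}. \tag{6.135} \]
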

It is easy to see that Bell locality is equivalent to the conjunction of PI and OI.
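One way to spell out this equivalence is sketched below, in the notation of (6.128) - (6.135) and under the additional assumption (made here only for simplicity) that all conditioning events have nonzero probability.

```latex
% PI and OI together imply Bell locality (6.121):
\begin{align*}
P(\lambda,\gamma \mid a,b,x)
  &= P(\lambda \mid a,b,\gamma,x)\, P(\gamma \mid a,b,x)
     && \text{by (6.134)} \\
  &= P(\lambda \mid a,b,x)\, P(\gamma \mid a,b,x)
     && \text{by OI, (6.130)} \\
  &= P(\lambda \mid a,x)\, P(\gamma \mid b,x)
     && \text{by PI, (6.128)--(6.129)}.
\end{align*}
% Conversely, Bell locality implies PI by summing over the other outcome,
% using (6.132) and \sum_\gamma P(\gamma \mid b,x) = 1:
\begin{align*}
P(\lambda \mid a,b,x)
  = \sum_{\gamma} P(\lambda \mid a,x)\, P(\gamma \mid b,x)
  = P(\lambda \mid a,x),
\end{align*}
% and likewise for \gamma. OI then follows from (6.134):
\begin{align*}
P(\lambda \mid a,b,\gamma,x)
  = \frac{P(\lambda \mid a,x)\, P(\gamma \mid b,x)}{P(\gamma \mid b,x)}
  = P(\lambda \mid a,x)
  = P(\lambda \mid a,b,x).
\end{align*}
```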

Note that the former (PI), akin to Locality, is a hidden or ‘subsurface’ version of
the no signaling property of the ‘surface’ probabilities, which states that

\[ P(\lambda \mid a,b) = \sum_{\gamma} P(\lambda,\gamma \mid a,b) \]

is independent of $b$ (and vice versa). But a violation of PI only leads to signaling if $x$
can be operationally controlled, similar to the way in which experimental physicists
prepare quantum states $\psi$. Hence it is reassuring that quantum mechanics satisfies
PI if we see the quantum state $\psi$ as a hidden variable: assuming

\[ P(\lambda,\gamma \mid a,b,x) = P_{\psi_0}(F = \lambda, G = \gamma \mid A = a, B = b), \tag{6.136} \]

as computed in (6.102) - (6.105), PI is valid but OI is not. First, for $\lambda = 0$ or $\lambda = 1$,

\[ P(\lambda \mid a,b,x) = \sum_{\gamma = 0,1} P_{\psi_0}(F = \lambda, G = \gamma \mid a,b) = \tfrac{1}{2} \cos^2(a - b) + \tfrac{1}{2} \sin^2(a - b) = \tfrac{1}{2}, \tag{6.137} \]

which is independent of $b$, and likewise $P(\gamma \mid a,b,x) = \tfrac{1}{2}$, independently of $a$. This
yields PI, which a similar computation shows to be true for any quantum state. On
the other hand, given this result, OI would require

\[ P_{\psi_0}(F = \lambda, G = \gamma \mid A = a, B = b) = P_{\psi_0}(F = \lambda \mid A = a) \cdot P_{\psi_0}(G = \gamma \mid B = b), \]

which is false, since by (6.102) - (6.105), Alice’s and Bob’s outcomes are correlated.
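Both claims are easy to verify numerically from (6.102) - (6.105). The following sketch (with hypothetical function names) computes the marginals and measures how far the joint probabilities are from the product of the marginals.

```python
import math


def joint(lam, gam, a, b):
    """Quantum probabilities (6.102)-(6.105) for outcomes lam, gam in {0, 1}."""
    if lam == gam:
        return 0.5 * math.cos(a - b) ** 2
    return 0.5 * math.sin(a - b) ** 2


def marginal_alice(lam, a, b):
    """Sum over Bob's outcome, as in (6.137)."""
    return sum(joint(lam, gam, a, b) for gam in (0, 1))


def marginal_bob(gam, a, b):
    """Sum over Alice's outcome."""
    return sum(joint(lam, gam, a, b) for lam in (0, 1))


def oi_defect(a, b):
    """Largest deviation of the joint probability from the product of the marginals."""
    return max(
        abs(joint(l, g, a, b) - marginal_alice(l, a, b) * marginal_bob(g, a, b))
        for l in (0, 1)
        for g in (0, 1)
    )


if __name__ == "__main__":
    for b in (0.0, 0.3, 1.0, 2.5):
        print(b, marginal_alice(1, 0.2, b), marginal_bob(0, 0.2, b))
    print(oi_defect(0.0, 0.0), oi_defect(0.0, math.pi / 4))
```

The marginals equal 1/2 for all settings (PI), whereas the factorization required by OI holds only at special angles such as $a - b = \pi/4$, where $\cos^2(a-b) = 1/2$.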

Hence Bell locality is violated by quantum mechanics, but this does not imply
that “quantum mechanics is nonlocal” (as some say). Bell’s is a very specific locality
condition invented as a constraint on hidden variable theories. In another important
sense, viz. Einstein locality, quantum mechanics is local, in that observables with
spacelike separated localization regions commute (this is the case in quantum field
theory, but also in any bipartite experiment of the type considered here, where Alice’s
operators commute with Bob’s just by definition of the tensor product).

On the other hand, deterministic theories, which in the present context are defined
as those for which all conditional probabilities like $P(\lambda,\gamma \mid a,b,x)$ are either zero or
one (in which case one may introduce random variables reproducing these probabilities),
violate PI but satisfy OI, at least if they reproduce the Born probabilities (such
as Bohmian mechanics). Hence such theories violate Bell locality.
Finally, Bell-type inequalities like (6.117) also give information about quantum
mechanics itself, particularly about the degree of entanglement of states. Let $H_1$ and
$H_2$ be Hilbert spaces, with tensor product $H_1 \otimes H_2$. A unit vector $\psi \in H_1 \otimes H_2$ is
called uncorrelated if it is of the form $\psi = \varphi_1 \otimes \varphi_2$, where $\varphi_k \in H_k$ are unit vectors,
$k = 1,2$, and correlated otherwise. Clearly, the vectors (6.34) and (6.101) used in
the experiments so far are correlated. The simplest result is then as follows.

Theorem 6.19. Let $a_1$ and $a_2$ be self-adjoint operators on $H_1$, and let $b_1$ and $b_2$ be
self-adjoint operators on $H_2$, each with spectrum contained in $[-1,1]$ (equivalently,
$\|a_i\| \leq 1$, etc.). Let $\psi$ be a unit vector in $H_1 \otimes H_2$ and define two-point functions

\[ \langle F_i G_j \rangle = \langle \psi, a_i \otimes b_j \psi \rangle. \tag{6.138} \]

If $\psi$ is uncorrelated, then the Bell inequality (6.117) holds.

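Proof. This follows from the factorization property

\[ \langle F_i G_j \rangle = \langle \varphi_1 \otimes \varphi_2, a_i \otimes b_j\, \varphi_1 \otimes \varphi_2 \rangle = \langle \varphi_1, a_i \varphi_1 \rangle \cdot \langle \varphi_2, b_j \varphi_2 \rangle = \langle F_i \rangle \cdot \langle G_j \rangle, \tag{6.139} \]

where $\langle F_i \rangle = \langle \varphi_1, a_i \varphi_1 \rangle$ and $\langle G_j \rangle = \langle \varphi_2, b_j \varphi_2 \rangle$. For either sign, this property yields

\[ \langle F_2 (G_1 - G_2) \rangle = \langle F_2 \rangle \langle G_1 \rangle (1 \pm \langle F_1 \rangle \langle G_2 \rangle) - \langle F_2 \rangle \langle G_2 \rangle (1 \pm \langle F_1 \rangle \langle G_1 \rangle). \tag{6.140} \]

The spectral assumption implies that $|\langle F_i \rangle| \leq 1$ and $|\langle G_j \rangle| \leq 1$, which will be used
directly below, as well as its consequence $|1 \pm \langle F_1 \rangle \langle G_j \rangle| = 1 \pm \langle F_1 \rangle \langle G_j \rangle$. Hence

\begin{align*}
|\langle F_2 (G_1 - G_2) \rangle| &\leq |1 \pm \langle F_1 \rangle \langle G_2 \rangle| + |1 \pm \langle F_1 \rangle \langle G_1 \rangle| \\
&= 1 \pm \langle F_1 \rangle \langle G_2 \rangle + 1 \pm \langle F_1 \rangle \langle G_1 \rangle \\
&= 2 \pm \langle F_1 (G_1 + G_2) \rangle. \tag{6.141}
\end{align*}

Similarly,

\[ |\langle F_1 (G_1 + G_2) \rangle| \leq 2 \pm \langle F_2 (G_1 - G_2) \rangle, \tag{6.142} \]

so that, writing $\Phi = \langle F_1 G_1 \rangle + \langle F_1 G_2 \rangle + \langle F_2 G_1 \rangle - \langle F_2 G_2 \rangle$, for either sign $\pm$ we have

\[ |\Phi| \leq |\langle F_1 (G_1 + G_2) \rangle| + |\langle F_2 (G_1 - G_2) \rangle| \leq 4 \pm \Phi. \tag{6.143} \]

If $\Phi > 0$ we choose the minus sign, whereas for $\Phi < 0$ we take the plus sign. Either
way, we obtain $|\Phi| \leq 2$, which is the inequality (6.117). □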

This result is actually much more general (as hinted at by the way that the proof
only uses the uncorrelated vector state $\psi = \varphi_1 \otimes \varphi_2$). The simplest generalization
is to replace pure states by mixed states, where we say that a density operator $\rho$
on $H_1 \otimes H_2$ is uncorrelated if it is of the form $\rho = \sum_i p_i\, \rho_1^{(i)} \otimes \rho_2^{(i)}$, where the $p_i$ are
probabilities and $\rho_k^{(i)}$ is a density matrix on $H_k$, $k = 1,2$. Then all uncorrelated density
matrices satisfy the inequality (6.117). Even more generally, uncorrelated states on
C*-algebras or von Neumann algebras $A \otimes B$ satisfy (6.117), see Notes.
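The contrast between correlated and uncorrelated vector states in Theorem 6.19 can be made concrete on $\mathbb{C}^2 \otimes \mathbb{C}^2$. The sketch below is restricted to real vectors and real symmetric matrices, and it takes as observables the operators $2e_\theta - 1$ (with $e_\theta$ the projection onto $(\cos\theta, \sin\theta)$); this choice of observables, the angles, and the particular product vector are illustrative assumptions rather than anything fixed by the text.

```python
import math


def kron_vec(u, v):
    """Tensor product of two vectors; component (i, j) is stored at index i*len(v) + j."""
    return [ui * vj for ui in u for vj in v]


def kron_mat(a, b):
    """Tensor product of two square matrices, using the same index convention."""
    n, m = len(a), len(b)
    return [
        [a[i][k] * b[j][l] for k in range(n) for l in range(m)]
        for i in range(n)
        for j in range(m)
    ]


def expect(psi, op):
    """<psi, op psi> for a real vector psi and a real symmetric matrix op."""
    d = len(psi)
    return sum(psi[i] * op[i][j] * psi[j] for i in range(d) for j in range(d))


def polarizer(theta):
    """2*e_theta - 1, a self-adjoint operator with spectrum {-1, +1}."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]


def chsh(psi, a1, a2, b1, b2):
    """<F1 G1> + <F1 G2> + <F2 G1> - <F2 G2> with two-point functions as in (6.138)."""
    def corr(a, b):
        return expect(psi, kron_mat(polarizer(a), polarizer(b)))
    return corr(a1, b1) + corr(a1, b2) + corr(a2, b1) - corr(a2, b2)


ANGLES = (0.0, math.pi / 4, math.pi / 8, -math.pi / 8)
BELL = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]   # (e1⊗e1 + e2⊗e2)/sqrt(2), cf. (6.101)
PRODUCT = kron_vec([math.cos(0.3), math.sin(0.3)], [math.cos(1.1), math.sin(1.1)])

if __name__ == "__main__":
    print("correlated state:  ", chsh(BELL, *ANGLES))
    print("uncorrelated state:", chsh(PRODUCT, *ANGLES))
```

For the correlated state (6.101) the combination reaches $2\sqrt{2}$, while for the product vector it stays within $[-2,2]$, as the theorem requires.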
6.6 The Colbeck-Renner Theorem

One may try to strengthen Bell’s second theorem by weakening its assumptions. A
remarkable result in this direction states that, roughly speaking, any probabilistic
hidden variable theory that satisfies Freedom and Parameter Independence and is
compatible with quantum mechanics adds nothing to quantum mechanics. In other
words, it appears that quantum mechanics “cannot be extended”, or “is complete”.

In fact, the result turns out to be more modest than this summary suggests, since
the reasoning required to prove the claim hinges on certain assumptions which are
satisfied by quantum mechanics itself, but might seem unnatural for a hidden variable
theory. In any case, we have to state our notation and assumptions very clearly.

Definition 6.20. A hidden variable theory $\mathcal{T}$ underlying quantum mechanics consists
of a measurable space $(X,\Sigma)$ whose points $x$ label conditional probabilities

\[ P(a_1 = \lambda_1, \ldots, a_n = \lambda_n \mid x) \equiv P(a = \lambda \mid x) \]

for the possible outcomes $\lambda = (\lambda_1,\ldots,\lambda_n)$ of a measurement of any family $a =
(a_1,\ldots,a_n)$ of $n$ commuting self-adjoint operators on any Hilbert space $H$.

These formal conditional probabilities are a priori only supposed to satisfy

\[ 0 \leq P(a = \lambda \mid x) \leq 1; \tag{6.144} \]

\[ \sum_{\lambda} P(a = \lambda \mid x) = 1. \tag{6.145} \]

Apart from these probabilities, for each Hilbert space $H$ and any pure state $e \in
\mathcal{P}_1(H)$, the theory $\mathcal{T}$ yields a classical state $\mu_e$, i.e., a probability measure on $X$.

As the notation indicates, $\mu_e$ depends on $e$ only and hence is independent of $a$ and
$\lambda$. From the point of view of $\mathcal{T}$, a quantum state is a probability measure on $X$! In
what follows we assume for simplicity that $H$ is finite-dimensional, so that $e = e_\psi$
for some unit vector $\psi \in H$. With slight abuse of notation we then write $\mu_\psi$ for $\mu_{e_\psi}$.
An important special case will be the bipartite setting $H = H_1 \otimes H_2$, where Alice
and Bob measure self-adjoint operators $X$ and $Y$ on $H_1$ and $H_2$, respectively, so that

\[ n = 2, \quad a_1 = X \otimes 1_{H_2}, \quad a_2 = 1_{H_1} \otimes Y. \]

We then introduce settings $c = (a,b)$, as in the previous sections, so that we typically
look at expressions like $P(X_a = \lambda_1, Y_b = \lambda_2 \mid x)$. The other case of interest will simply
be $n = 1$ with $a_1 = a$, $\lambda_1 = \lambda$; indeed, this will be the case in the statement of the
theorem (the bipartite case playing a role only in the proof, though a crucial one!).
The following notation will be quite important to the argument. An equality

\[ P_\psi(a = \lambda \mid x) = \alpha(x), \tag{6.146} \]

where $\alpha : X \to [0,1]$ is measurable (often even constant), abbreviates:

$P(a = \lambda \mid x) = \alpha(x)$ for almost every $x$ with respect to the measure $\mu_\psi$.

That is, there is a subset $X' \subset X$ such that $\mu_\psi(X') = 0$ and $P(a = \lambda \mid x) = \alpha(x)$ holds
for any $x \in X \setminus X'$. If $X$ is finite, this simply means that the equality holds for any $x$
for which $\mu_\psi(\{x\}) > 0$. Since this notation may render equalities like

\[ P_\psi(a = \lambda \mid x) = P_\varphi(a' = \lambda' \mid x), \tag{6.147} \]

ambiguous, we explicitly define (6.147) as the double implication

\[ P_\psi(a = \lambda \mid x) = \alpha(x) \iff P_\varphi(a' = \lambda' \mid x) = \alpha(x). \]

Furthermore, for $\varepsilon \to 0$ we write

\[ P_\psi(a = \lambda \mid x) \stackrel{\varepsilon}{\approx} P_\varphi(a' = \lambda' \mid x) \iff P_\psi(a = \lambda \mid x) = P_\varphi(a' = \lambda' \mid x) + O(\sqrt{\varepsilon}), \tag{6.148} \]

as well as

\[ \psi \stackrel{\varepsilon}{\approx} \varphi \iff (1 - \varepsilon) \leq |\langle \psi, \varphi \rangle| \leq 1. \tag{6.149} \]

We are now ready to state our assumptions for the Colbeck-Renner Theorem:

• Compatibility with Quantum Mechanics (CQ): for any unit vector $\psi \in H$,

\[ \int_X d\mu_\psi(x)\, P(a = \lambda \mid x) = p_\psi(a = \lambda), \tag{6.150} \]

where the quantum-mechanical prediction $p_\psi(a = \lambda)$ is given by the Born rule

\[ p_\psi(a = \lambda) = \langle \psi, e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n} \psi \rangle, \tag{6.151} \]

cf. (2.21), where $e^{(i)}_{\lambda_i}$ is the spectral projection on the eigenspace $H_{\lambda_i} \subset H$ of $a_i$.

• Unitary Invariance (UI): for any unit vector $\psi \in H$ and unitary $u$ on $H$,

\[ P_{u\psi}(a = \lambda \mid x) = P_\psi(u^{-1} a u = \lambda \mid x). \tag{6.152} \]

• Continuity of Probabilities (CP): If $\psi \stackrel{\varepsilon}{\approx} \varphi$, then $P_\psi(a = \lambda \mid x) \stackrel{\varepsilon}{\approx} P_\varphi(a = \lambda \mid x)$.

In the remaining axioms, $H = H_1 \otimes H_2$, and $a$ and $b$ are self-adjoint operators on $H_1$
and $H_2$, respectively (duly identified with operators $a \otimes 1_{H_2}$ and $1_{H_1} \otimes b$ on $H$).

• Parameter Independence (PI):

\[ \sum_{\gamma \in \sigma(b)} P(a = \lambda, b = \gamma \mid x) = P(a = \lambda \mid x); \tag{6.153} \]

\[ \sum_{\lambda \in \sigma(a)} P(a = \lambda, b = \gamma \mid x) = P(b = \gamma \mid x). \tag{6.154} \]

• Product Extension (PE): for any pair of states $\psi_1 \in H_1$, $\psi_2 \in H_2$,

\[ P_{\psi_1}(a = \lambda \mid x) = P_{\psi_1 \otimes \psi_2}(a = \lambda \mid x). \tag{6.155} \]

• Schmidt Extension (SE): if $v_i \in H_1$ ($i = 1,\ldots,\dim(H_1)$) are eigenstates of $a$, then
for arbitrary orthogonal states $u_i \in H_2$ and coefficients $c_i > 0$ with $\sum_i c_i^2 = 1$,

\[ P_{\sum_i c_i v_i}(a = \lambda \mid x) = P_{\sum_i c_i v_i \otimes u_i}(a = \lambda \mid x). \tag{6.156} \]

Note that PI makes sense, because (6.151) and (6.150) imply that for $P_\psi(a = \lambda \mid x)$
to be nonzero we must have $\lambda_i \in \sigma(a_i)$ for each $i$. All assumptions are satisfied by
quantum mechanics itself (seen as a hidden variable theory with $\psi$ as the “hidden”
variable $x$). In the context of hidden variable theories, though, one might doubt the
plausibility of UI, CP, and SE. But we need all these assumptions to prove:

Theorem 6.21. If $\mathcal{T}$ satisfies CQ, UI, CP, PI, PE, and SE, then for any (finite-dimensional)
Hilbert space $H$, unit vector $\psi \in H$, and operator $a \in B(H)_{\mathrm{sa}}$,

\[ P_\psi(a = \lambda \mid x) = p_\psi(a = \lambda). \tag{6.157} \]

In other words, the hidden variable $x$ is even more hidden than expected, since knowing
its value has no effect on the probabilities for the outcomes of experiments.

Proof. We first assume (without loss of generality) that $a$ is nondegenerate as a self-adjoint
matrix, in that it has distinct eigenvalues $(\lambda_1,\ldots,\lambda_l)$; this assumption
will be removed at the end of the proof. The proof consists of three steps.

1. The theorem holds for $H = \mathbb{C}^2$ and any pair $(a,\psi)$ for which

\[ p_\psi(a = \lambda_1) = p_\psi(a = \lambda_2) = 1/2. \tag{6.158} \]

This only requires assumptions CQ, PI, and SE.

2. The theorem holds for $H = \mathbb{C}^l$, $l < \infty$ arbitrary, and any pair $(a,\psi)$ for which

\[ p_\psi(a = \lambda_1) = \cdots = p_\psi(a = \lambda_l) = 1/l. \tag{6.159} \]

This is just a slight extension of step 1 and uses the same three assumptions.

3. The theorem holds in general. This requires all assumptions (as well as step 2).

Proof of step 1. Let $H = \mathbb{C}^2$, with basis $(v_1,v_2)$ of eigenvectors of $a$, so that $\psi \in \mathbb{C}^2$
may be written as

\[ \psi = (v_1 + v_2)/\sqrt{2}. \tag{6.160} \]

Without loss of generality, we may assume that $\lambda_1 = 1$ and $\lambda_2 = -1$. We now relabel
$a$ as $a_0$ and extend it to a family of operators $(a_k)_{k=0,1,\ldots,2N-1}$ by fixing an integer
$N > 1$, putting $\theta_k = k\pi/2N$, and defining

\[ a_k = e_{\theta_k + \pi} - e_{\theta_k}, \tag{6.161} \]

where, for any angle $\theta \in [0,2\pi]$, the operator $e_\theta = |\theta\rangle\langle\theta|$ is the orthogonal projection
onto the one-dimensional subspace spanned by the unit vector

\[ |\theta\rangle = \sin(\theta/2) \cdot v_1 + \cos(\theta/2) \cdot v_2. \tag{6.162} \]







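In the bipartite setting, we have operators $a_k \equiv a_k \otimes 1_2$ and $b_k = 1_2 \otimes a_k$ on $\mathbb{C}^2 \otimes \mathbb{C}^2$,
as well as a maximally correlated (Bell) state $\psi_{AB} \in \mathbb{C}^2 \otimes \mathbb{C}^2$, given by

\[ \psi_{AB} = \frac{1}{\sqrt{2}} (v_1 \otimes v_1 + v_2 \otimes v_2). \tag{6.163} \]

Using assumptions PI and SE, we then have, for $i = 1,2$, $\lambda_1 = 1$, and $\lambda_2 = -1$,

\[ P_\psi(a = \lambda_i \mid x) = P_{\psi_{AB}}(a_0 = \lambda_i \mid x). \tag{6.164} \]

The quantum-mechanical prediction is

\[ p_{\psi_{AB}}(a_0 = 1) = p_{\psi_{AB}}(a_0 = -1) = \tfrac{1}{2}. \tag{6.165} \]

Our goal is to show that also for each $x \in X$, knowing $x$ is irrelevant in that

\[ P_{\psi_{AB}}(a_0 = 1 \mid x) = P_{\psi_{AB}}(a_0 = -1 \mid x) = \tfrac{1}{2}. \tag{6.166} \]

To this effect we introduce the combination of probabilities

\[ I_N(x) = P(a_0 = b_{2N-1} \mid x) + \sum_{k \in K_N,\, l \in L_N,\, |k-l| = 1} P(a_k \neq b_l \mid x), \tag{6.167} \]

where $K_N = \{0,2,\ldots,2N-2\}$ and $L_N = \{1,3,\ldots,2N-1\}$. Our first inequality is

\begin{align*}
|P(a_k = \lambda_i \mid x) - P(b_l = \lambda_i \mid x)| &= |P(a_k = \lambda_i, b_l = \lambda_i \mid x) + P(a_k = \lambda_i, b_l \neq \lambda_i \mid x) \\
&\quad - P(a_k = \lambda_i, b_l = \lambda_i \mid x) - P(a_k \neq \lambda_i, b_l = \lambda_i \mid x)| \\
&= |P(a_k = \lambda_i, b_l \neq \lambda_i \mid x) - P(a_k \neq \lambda_i, b_l = \lambda_i \mid x)| \\
&\leq P(a_k = \lambda_i, b_l \neq \lambda_i \mid x) + P(a_k \neq \lambda_i, b_l = \lambda_i \mid x) \\
&= P(a_k \neq b_l \mid x), \tag{6.168}
\end{align*}

where $i = 1,2$, and we used PI. This implies a second inequality: since $a_{2N} = -a_0$,

\begin{align*}
|P(a_0 = 1 \mid x) - P(a_0 = -1 \mid x)| &= |P(a_0 = 1 \mid x) - P(a_{2N} = 1 \mid x)| \\
&\leq \sum_{k,l:\, |k-l| = 1} |P(a_k = 1 \mid x) - P(b_l = 1 \mid x)| \\
&\leq I_N(x).
\end{align*}

Integrating this with respect to the measure $\mu_{\psi_{AB}}$ and using CQ gives

\[ \int_X d\mu_{\psi_{AB}}(x)\, |P(a_0 = 1 \mid x) - P(a_0 = -1 \mid x)| \leq \int_X d\mu_{\psi_{AB}}(x)\, I_N(x) = I_N^{\psi_{AB}}. \tag{6.169} \]

We wish to invoke the corresponding quantum-mechanical expression, defined by

\[ I_N^{\psi_{AB}} = p_{\psi_{AB}}(a_0 = b_{2N-1}) + \sum_{k \in K_N,\, l \in L_N,\, |k-l| = 1} p_{\psi_{AB}}(a_k \neq b_l). \tag{6.170} \]

A straightforward calculation shows that this expression is equal to

\[ I_N^{\psi_{AB}} = 2N \sin^2(\pi/4N). \tag{6.171} \]

Since $\lim_{N \to \infty} I_N^{\psi_{AB}} = 0$, letting $N \to \infty$ in (6.169) therefore yields (6.166). From
(6.164) we then obtain (6.158).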
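The “straightforward calculation” leading to (6.171) can be reproduced numerically. The sketch below works in the basis $(v_1,v_2)$ with real coefficients and evaluates the Born probabilities in the Bell state (6.163) directly from the vectors (6.162); the function names are hypothetical, and the identification of the outcome $a_k = +1$ with the projection $e_{\theta_k+\pi}$ is read off from (6.161).

```python
import math


def ket(theta):
    """|theta> = sin(theta/2) v1 + cos(theta/2) v2 in the basis (v1, v2), cf. (6.162)."""
    return (math.sin(theta / 2), math.cos(theta / 2))


def p_joint(theta_a, theta_b):
    """Born probability of e_{theta_a} (x) e_{theta_b} in the Bell state (6.163)."""
    (a1, a2), (b1, b2) = ket(theta_a), ket(theta_b)
    return 0.5 * (a1 * b1 + a2 * b2) ** 2


def p_equal(k, l, n):
    """p(a_k = b_l), with theta_k = k*pi/(2n): both outcomes +1 or both -1."""
    ta, tb = k * math.pi / (2 * n), l * math.pi / (2 * n)
    both_plus = p_joint(ta + math.pi, tb + math.pi)   # a_k = +1 and b_l = +1
    both_minus = p_joint(ta, tb)                      # a_k = -1 and b_l = -1
    return both_plus + both_minus


def i_quantum(n):
    """The sum (6.170): p(a_0 = b_{2N-1}) plus p(a_k != b_l) over even k, odd l, |k-l| = 1."""
    total = p_equal(0, 2 * n - 1, n)
    for k in range(0, 2 * n - 1, 2):          # k in {0, 2, ..., 2N-2}
        for l in (k - 1, k + 1):              # odd neighbours of k
            if 1 <= l <= 2 * n - 1:
                total += 1.0 - p_equal(k, l, n)
    return total


if __name__ == "__main__":
    for n in (1, 2, 5, 50, 1000):
        print(n, i_quantum(n), 2 * n * math.sin(math.pi / (4 * n)) ** 2)
```

For large $N$ the value behaves like $\pi^2/8N$, which is the decay used in the limit $N \to \infty$ above.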

Proof of step 2. Let $H = \mathbb{C}^l$ and let $(v_i)_{i=1}^l$ be an orthonormal basis of eigenvectors
of $a$, with corresponding eigenvalues $\lambda_i$, and phase factors for the eigenvectors $v_i$
such that $c_i > 0$ (and of course, $\sum_i c_i^2 = 1$) in the expansion

\[ \psi = \sum_i c_i v_i. \tag{6.172} \]

The case of interest will be $c_1 = \cdots = c_l = 1/\sqrt{l}$, but first we merely assume that
$c_1 = c_2$ (the same reasoning applies to any other pair), with $\lambda_1 = 1$ and $\lambda_2 = -1$
(which involves no loss of generality either and just simplifies the notation). The
other positive coefficients $c_i$ are arbitrary. Generalizing (6.166), we will show that

\[ P_\psi(a = 1 \mid x) = P_\psi(a = -1 \mid x). \tag{6.173} \]

This shows that if two Born probabilities defined by some quantum state are
equal, then the underlying hidden variable probabilities must be equal $\mu_\psi$-a.e., too.
Eq. (6.159) immediately follows from this result by taking all $c_i$ to be equal.

As in step 1, we pass to the bipartite setting, introducing two copies of $H = \mathbb{C}^l$
denoted by $H_A = H_B = \mathbb{C}^l$, and define the correlated state

\[ \psi_{AB} = \sum_i c_i\, v_i \otimes v_i \tag{6.174} \]

in $H_A \otimes H_B$. Eq. (6.164) again follows from assumptions PI and SE. Throughout
the argument of step 1, we now replace each probability $P(a_k = \lambda_1, b_l = \gamma_2 \mid x)$ by an
adapted probability $P^{1,2}(a_k = \lambda_1, b_l = \gamma_2 \mid x)$, defined as the conditional probability

\begin{align*}
P^{1,2}(a_k = \lambda_1, b_l = \gamma_2 \mid x) &= P(a_k = \lambda_1, b_l = \gamma_2 \mid |\lambda_1| = |\gamma_2| = 1, x) \\
&= \frac{P(a_k = \lambda_1, b_l = \gamma_2, |\lambda_1| = |\gamma_2| = 1 \mid x)}{P(|\lambda_1| = |\gamma_2| = 1 \mid x)} \tag{6.175}
\end{align*}

for all $x$ for which $P(|\lambda_1| = |\gamma_2| = 1 \mid x) > 0$, whereas

\[ P^{1,2}(a_k = \lambda_1, b_l = \gamma_2 \mid x) = 0 \tag{6.176} \]

whenever $P(|\lambda_1| = |\gamma_2| = 1 \mid x) = 0$. The same argument then yields (6.169), with $P$
replaced by $P^{1,2}$ but with the same right-hand side. As in step 1,

\[ P^{1,2}_{\psi_{AB}}(a_0 = 1 \mid x) = P^{1,2}_{\psi_{AB}}(a_0 = -1 \mid x), \tag{6.177} \]

which implies that

\[ P_{\psi_{AB}}(a_0 = 1 \mid x) = P_{\psi_{AB}}(a_0 = -1 \mid x), \tag{6.178} \]

either because both sides vanish (if $P(|\lambda_1| = |\gamma_2| = 1 \mid x) = 0$), or because (in the opposite
case) the denominator $P(|\lambda_1| = |\gamma_2| = 1 \mid x)$ cancels from both sides of (6.177).
Combined with (6.164), eq. (6.178) proves (6.173) and hence establishes step 2.

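Proof of step 3. This is the most difficult step in the proof, relying on a technique
wittily called embezzlement (which we only need for maximally entangled states).
We will deal with three Hilbert spaces, namely $H = \mathbb{C}^l$, $H' = \mathbb{C}^m$, and $H'' = \mathbb{C}^n$
(where $n = m^N$ for some large $N$, see below), each with some fixed orthonormal
basis $(v_i)_{i=1}^l$, $(v'_j)_{j=1}^m$, $(v''_k)_{k=1}^n$, respectively. Given a further number $m_i \leq m$,
we now list the $nm$ basis vectors $v''_k \otimes v'_j$ of $H'' \otimes H'$ in two different orders:

1. $v''_1 \otimes v'_1, \ldots, v''_n \otimes v'_1, v''_1 \otimes v'_2, \ldots, v''_n \otimes v'_2, \ldots, v''_1 \otimes v'_m, \ldots, v''_n \otimes v'_m$;

2. $v''_1 \otimes v'_1, \ldots, v''_1 \otimes v'_{m_i}, v''_2 \otimes v'_1, \ldots, v''_2 \otimes v'_{m_i}, \ldots, v''_n \otimes v'_1, \ldots, v''_n \otimes v'_{m_i}, \ldots$,

where the remaining vectors (i.e., those of the form $v''_k \otimes v'_j$ for $1 \leq k \leq n$ and $j > m_i$)
are listed in some arbitrary order.

Define

\[ u^{(m_i)} : H'' \otimes H' \to H'' \otimes H' \tag{6.179} \]

as the unitary operator that maps the first list on the second. We will need the explicit
expression

\[ u^{(m_i)}(v''_k \otimes v'_1) = v''_{s_i} \otimes v'_{j_i}, \tag{6.180} \]

where for given $k = 1,\ldots,n$ the numbers $s_i = 1,\ldots,n_i$ (where $n_i$ is the smallest
integer such that $n_i m_i \geq n$) and $j_i = 1,\ldots,m_i$ are uniquely determined by

\[ k = (s_i - 1) m_i + j_i. \tag{6.181} \]

We will actually work with two copies of $H'' \otimes H'$, called $H''_A \otimes H'_A$ and $H''_B \otimes H'_B$,
with ensuing copies $u_A^{(m_i)}$ and $u_B^{(m_i)}$ of $u^{(m_i)}$, and hence, leaving the isomorphism

\[ H''_A \otimes H'_A \otimes H''_B \otimes H'_B \cong H''_A \otimes H''_B \otimes H'_A \otimes H'_B \tag{6.182} \]

implicit, we obtain a unitary operator

\[ u_A^{(m_i)} \otimes u_B^{(m_i)} : H''_A \otimes H''_B \otimes H'_A \otimes H'_B \to H''_A \otimes H''_B \otimes H'_A \otimes H'_B. \tag{6.183} \]

The point of all this is that the unit vector

\[ \kappa_n \in H''_A \otimes H''_B; \tag{6.184} \]

\[ \kappa_n = \frac{1}{\sqrt{C(n)}} \sum_{k=1}^n \frac{1}{\sqrt{k}}\, v''_k \otimes v''_k, \tag{6.185} \]

where $C(n) = \sum_{k=1}^n 1/k$, is a “catalyst” in producing the maximally entangled
state

\[ \varphi \in H'_A \otimes H'_B; \tag{6.186} \]

\[ \varphi = \frac{1}{\sqrt{m_i}} \sum_{j=1}^{m_i} v'_j \otimes v'_j \tag{6.187} \]

from the uncorrelated state $v'_1 \otimes v'_1 \in H'_A \otimes H'_B$, in that for any $m_i \leq m$,

\[ u_A^{(m_i)} \otimes u_B^{(m_i)} (\kappa_n \otimes v'_1 \otimes v'_1) \stackrel{\varepsilon}{\approx} \kappa_n \otimes \varphi. \tag{6.188} \]

Here $\varepsilon = 1/N$ if $n = m^N$. This follows straightforwardly from (6.183) - (6.187).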
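The quality of the embezzlement can be estimated numerically. The sketch below rests on assumptions that should be checked against the formulae above: it takes the catalyst to be of the standard form $\kappa_n = C(n)^{-1/2} \sum_k k^{-1/2}\, v''_k \otimes v''_k$, the target to be the maximally entangled state on $m$ levels (i.e. $m_i = m$), and the unitary to act as $v''_k \otimes v'_1 \mapsto v''_s \otimes v'_j$ with $k = (s-1)m + j$ on both copies. Under these assumptions the overlap between the transformed vector and $\kappa_n \otimes \varphi$ is $C(n)^{-1} \sum_k (k\, s\, m)^{-1/2}$. The function names are hypothetical.

```python
import math


def harmonic(n):
    """C(n) = sum_{k=1}^n 1/k, the normalization constant of the catalyst."""
    return sum(1.0 / k for k in range(1, n + 1))


def embezzlement_overlap(m, n):
    """Overlap <kappa_n (x) phi, (u_A (x) u_B)(kappa_n (x) v'_1 (x) v'_1)>,
    assuming the forms of kappa_n, phi and u described in the lead-in."""
    c = harmonic(n)
    total = 0.0
    for k in range(1, n + 1):
        s = (k - 1) // m + 1          # k = (s - 1) * m + j with 1 <= j <= m
        # amplitude 1/sqrt(C k) meets amplitude 1/sqrt(C s) * 1/sqrt(m)
        total += 1.0 / math.sqrt(k * s * m)
    return total / c


if __name__ == "__main__":
    m = 2
    for big_n in range(2, 9):
        n = m ** big_n
        print(big_n, n, embezzlement_overlap(m, n), 1 - 1 / big_n)
```

For $m = 2$ the computed overlap stays above $1 - 1/N$ for the values of $N$ tried here, in line with the statement that $\varepsilon = 1/N$ when $n = m^N$.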

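After this preparation we are ready for the proof of step 3, continuing to use the
notation established at the beginning of step 2, especially (6.172). As in step 1, we
introduce two copies $H_A = H_B = \mathbb{C}^l$ of $H$, as well as two states

\[ \psi_{AB} = \sum_i c_i\, v_i \otimes v_i \in H_A \otimes H_B; \tag{6.189} \]

\[ \psi'''_{AB} = \kappa_n \otimes v'_1 \otimes v'_1 \otimes \psi_{AB} \in H'''_A \otimes H'''_B, \tag{6.190} \]

where $\kappa_n$ is given by (6.185), we put

\[ H''' = H'' \otimes H' \otimes H, \tag{6.191} \]

and in our notation we have ignored the obvious permutations of factors in the tensor
product. For any $\varepsilon > 0$, pick $c'_i \in \mathbb{R}^+$ such that $(c'_i)^2 \in \mathbb{Q}^+$ and

\[ |c'_i - c_i| \leq \varepsilon/\dim(H), \tag{6.192} \]

which implies that, in the sense of (6.149), we have

\[ \sum_i c'_i v_i \stackrel{\varepsilon}{\approx} \sum_i c_i v_i. \tag{6.193} \]

Suppose

\[ c'_i = \sqrt{p_i/q_i}, \tag{6.194} \]

with $p_i,q_i \in \mathbb{N}$ and $\gcd(p_i,q_i) = 1$, and define

\[ m_i = p_i \prod_{i' \neq i} q_{i'}. \tag{6.195} \]

Consequently, writing

\[ q = \frac{1}{\sqrt{\prod_{i} q_i}}, \tag{6.196} \]

the following quotient is independent of $i$: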
Given the integers $m_i$ thus obtained, we define a unitary operator

u ; H'” ^ H'”\ 

i=l 


(6.197) 


(6.198) 

(6.199) 


where is defined in (6.180). From this definition, with additional labels to de¬ 
note the copies ua : ^ and ug ; Hg , and (6.188), and writing 


^0 = Vi®v'j G //(g)//', (6.200) 

with corresponding copies 

( 6 . 201 ) 

( 6 . 202 ) 

we then obtain the important relations 

/ 


Ih'ji (g) l/r"'(V/4B) 
Ua (g) ^h’^'{Wab) 
^h'^®^b{Wab) 
Ua (g) usiy^AB) 


— (Si ; 

i=l 


l tni 

q-Kn®Y. E '^AA' ®^BB'- 

'=iii=i 


(6.203) 

(6.204) 

(6.205) 

(6.206) 


Here the right-hand sides of (6.203) - (6.206) have been arranged so as to obtain 
vectors in the six-fold tensor product 

H'l (g) ®HA®H'A®HB®H'g. 


We will repeatedly invoke the following lemma, whose proof just unfolds the
notation (on the appropriate identification of $a$ with $a \otimes 1_{H_2}$ and of $b$ with $1_{H_1} \otimes b$).

Lemma 6.22. Assume PI and UI. For any pair of unitary operators $u_1$ on $H_1$ and
$u_2$ on $H_2$, and any unit vector $\psi \in H_1 \otimes H_2$, one has

\[ P_{(u_1 \otimes 1_{H_2})\psi}(b = \gamma \mid x) = P_\psi(b = \gamma \mid x); \tag{6.207} \]

\[ P_{(1_{H_1} \otimes u_2)\psi}(a = \lambda \mid x) = P_\psi(a = \lambda \mid x). \tag{6.208} \]
Since we assume that $a$ is nondegenerate, there is a bijective correspondence
between its eigenvalues $a = \lambda_i$ and its eigenvectors $v_i$. Instead of $P(a = \lambda_i)$ dressed
with whatever parameters $x$ or $\psi$, we may then write $P(v_i)$, where $a$ is understood,
and analogously for the more complicated operators on tensor products of Hilbert
spaces appearing below. Repeatedly using Lemma 6.22, we proceed as follows.


• From Step 2, using the notation explained below (6.172), 

Pyl c‘Ji ■ 


(6.209) 


• From (6.156) in PE and (6.209), 

• From (6.155) in and (6.210), 




‘’H ^AA' ^^BB' 

• From (6.211), CP (whose notation we use), and (6.206), 


( 6 . 210 ) 


( 6 . 211 ) 


P(uAlglUBWAB^^BB'\^'i ~ ^ ■ ( 6 . 212 ) 


• Recall the number m (satisfying m > nii for all i). From (6.212) and Lemma 6.22, 


W ^ Ui = 1, ■ ■ 

P(\„,n®UBWj:B^^BB'\’^) ^ 0 Ui = m; + 1,... ,m). (6.213) 

We now start a different line of argument, to be combined with (6.213) in due course. 

• From PE, SE, and (6.172), with G Ha denoting u, G H, we have 

Py/ia = Xi\x) = Pii/{Vi\x) (^Ak)- (6.214) 

• Using Lemma 6.22, (6.203), and (6.204), 

(6.215) 

and hence 

Py,{a = A,lx) = k). (6.216) 

• From quantum mechanics, notably (6.151), and (6.205), for any i' ^ i we have 

P{ijjm0UB)y/^g(^A ® (6.217) 

• From CQ and (6.217), for any i' ^ i. 
(6.218)

• From PI, 

pivi\x) = YPi4,&y, 

(6.219) 


iji 



p&x) = Ypi^'L&)- 

(6.220) 


• From (6.218), (6.219), and (6.220), 

^{IuIii®ub)Vab^'^a\^) ~^^{IuIii®ub)Wab^^BB'\^^ ■ ( 6 . 221 ) 

ji ^ 

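Finally, from (6.214), (6.221), (6.213), and (6.197) we obtain

\[ P_\psi(a = \lambda_i \mid x) \stackrel{\varepsilon}{\approx} \sum_{j_i = 1}^{m_i} q^2 = m_i \cdot q^2 = (c'_i)^2. \tag{6.222} \]

Since $c_i > 0$ we have $c_i^2 = |c_i|^2$; using (6.192) and letting $\varepsilon \to 0$ then proves step 3:

\[ P_\psi(a = \lambda_i \mid x) = |c_i|^2 = p_\psi(a = \lambda_i). \tag{6.223} \]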
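The arithmetic behind the integers $m_i$ of (6.195) can be checked with exact rational numbers. The sketch below assumes, as in (6.194), that the squares $(c'_i)^2 = p_i/q_i$ are given in lowest terms; the sample values and function names are hypothetical. It confirms that $(c'_i)^2/m_i$ does not depend on $i$, which is what allows the weights $(c'_i)^2$ to be written as $m_i$ equal contributions $q^2$ in (6.222).

```python
from fractions import Fraction
from math import prod


def integers_m(squares):
    """Given rational squares (c'_i)^2 = p_i/q_i in lowest terms, return the
    integers m_i = p_i * prod_{i' != i} q_{i'} of (6.195)."""
    qs = [s.denominator for s in squares]
    return [
        s.numerator * prod(q for j, q in enumerate(qs) if j != i)
        for i, s in enumerate(squares)
    ]


def common_quotient(squares):
    """Return (c'_i)^2 / m_i, checking that it is the same for every i."""
    ms = integers_m(squares)
    quotients = {s / m for s, m in zip(squares, ms)}
    if len(quotients) != 1:
        raise ValueError("quotient depends on i")
    return quotients.pop()


# Hypothetical rational approximants (c'_i)^2, chosen here to sum to 1.
SQUARES = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

if __name__ == "__main__":
    print(integers_m(SQUARES))        # the integers m_i
    print(common_quotient(SQUARES))   # q^2 = 1 / (q_1 q_2 q_3)
```

`Fraction` keeps numerator and denominator coprime automatically, so the condition $\gcd(p_i,q_i) = 1$ need not be enforced by hand.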

Finally, we remove our standing assumption that the spectrum of $a$ be nondegenerate.
In the degenerate case one has

\[ p_\psi(a = \lambda_i) = \sum_{j_i} p_\psi(v_{j_i}), \tag{6.224} \]

where the sum is over any orthonormal basis $(v_{j_i})_{j_i}$ of the eigenspace of $\lambda_i$. Similarly,
since each vector $v_{j_i}$ gives $a = \lambda_i$, probability theory gives for all $x$,

\[ P(a = \lambda_i \mid x) = \sum_{j_i} P(v_{j_i} \mid x). \tag{6.225} \]

The nondegenerate case of the theorem (which distinguishes the states $v_{j_i}$) yields

\[ P_\psi(v_{j_i} \mid x) = p_\psi(v_{j_i}), \tag{6.226} \]

from which (6.157) follows once again:

\[ P_\psi(a = \lambda_i \mid x) = \sum_{j_i} P_\psi(v_{j_i} \mid x) = \sum_{j_i} p_\psi(v_{j_i}) = p_\psi(a = \lambda_i). \]

Our proof of the Colbeck-Renner Theorem is now complete. □

Under less stringent assumptions this theorem might have been regarded as the 
conclusion of von Neumann’s program to disprove the possibility of completing 
quantum mechanics by adding hidden variables, but as yet this seems unwarranted. 
Notes

§6.1. From von Neumann to Kochen-Specker 

‘For decades nobody spoke up against von Neumann’s arguments, and his conclusions were
quoted by some as the gospel’. (Belinfante, 1973, p. 24)

Theorem 6.2 is due to von Neumann (1932, §IV.2); it was the first result to impose
useful constraints on hidden variable theories, anticipating all later literature on the
subject. Unfortunately (as part of their general anti-Copenhagen rhetoric), Bell and
his followers left the realm of decent academic discourse by calling von Neumann’s
arguments against hidden variables ‘silly’ and ‘foolish’, through which they merely
displayed the depth of their own misunderstanding of von Neumann’s reasoning; see
Caruana (1995), Bub (2011a), and especially Dieks (2016b). In fact, von Neumann
(1932, p. 172) carefully qualifies his Theorem 6.2 by stating that it follows ‘im Rahmen
unserer Bedingungen’ (i.e. ‘given our assumptions’), of which he earlier (on
p. 164) admits that linearity is physically reasonable only for commuting operators,
but nonetheless justifies this assumption through an ensemble argument (now outdated,
but by no means ‘silly’). Though couched in agreeable academic parlance,
the earlier critique by Hermann (1935) was misguided, too (Dieks, 2016b).

The Kochen-Specker Theorem is due to Kochen & Specker (1967); the authors
were originally logicians. A similar but less precise statement had appeared earlier
in Bell (1966), who was not cited by Kochen and Specker; some authors refer to
the Bell-Kochen-Specker Theorem. The Nature assumption has been experimentally
verified, cf. Huang et al (2003). The proof of the fundamental Lemma 6.7 we
present is essentially due to Kochen and Specker, as simplified by Peres (1995). Our
independent proof for $\mathbb{C}^4$ is taken from Cabello et al (1996). Surveys of various
proofs are given by Brown (1992) and Gould (2009); see also Waegell & Aravind
(2012) and references therein, as well as Bub (1997) for another proof. From the
Netherlands, we cannot fail to mention the short proof by Gill & Keane (1996). For
geometric aspects (and even a link with M.C. Escher) see Zimba & Penrose (1993).

One finds two opposite directions of research around the Kochen-Specker Theorem.
A computational one, which seems hardly relevant to conceptual issues in
physics (the goal rather being The Guinness Book of Records), consists of attempts
to find a minimal set of vectors that cannot be coloured. See, for example, Pavicic
et al (2005) for arbitrary dimension and Arends (2009) and Uijlen & Westerbaan
(2015) for dimension three, the latter paper showing that at least 22 vectors are needed.

The other, which is of significant conceptual importance and hence is worth some
more extensive discussion, consists of attempts to find a maximal set of vectors that
can be coloured. That is, one looks for large (preferably dense and measurable)
subsets $S^2_c$ of $S^2$ for which there exists a function $V : S^2_c \to \{0,1\}$ that satisfies:

• $V(-u) = V(u)$ for each $u \in S^2_c$;

• $V(u_1) + V(u_2) + V(u_3) = 2$, for each (orthonormal) basis $(u_1,u_2,u_3)$ of $\mathbb{R}^3$
whose elements lie in $S^2_c$.
The first result in this direction was obtained by Meyer (1999) and Havlicek et al
(2001), who showed that one may take $S^2_c = S^2 \cap \mathbb{Q}^3$; this choice was motivated by
invoking finite precision arguments to circumvent the Kochen-Specker Theorem,
see below. To write down a suitable function $V : S^2 \cap \mathbb{Q}^3 \to \{0,1\}$, we first define
an auxiliary function $\delta : S^2 \cap \mathbb{Q}^3 \to \mathbb{Z}$ by

$$\delta\left(\frac{n_1}{m_1}, \frac{n_2}{m_2}, \frac{n_3}{m_3}\right) = \frac{n_3}{m_3} \cdot \frac{\mathrm{lcm}(m_1,m_2,m_3)}{\gcd(n_1,n_2,n_3)}, \tag{6.227}$$

where lcm is the least common multiple and gcd is the greatest common divisor of
the argument. This function is obviously well defined. Then the following works:

$$V(x,y,z) = 0 \text{ if } \delta(x,y,z) \text{ is odd}; \tag{6.228}$$

$$V(x,y,z) = 1 \text{ if } \delta(x,y,z) \text{ is even}. \tag{6.229}$$

More generally, for an arbitrary ($n$-dimensional) Hilbert space $H$, with $n < \infty$,
Clifton & Kent (2000) proved the existence of a countable dense colorable subset
$\mathcal{P}_1(H)_c$ of $\mathcal{P}_1(H)$ (cf. Definition 6.9), with the additional property that different
resolutions of the identity drawn from $\mathcal{P}_1(H)_c$ never share a projection (so that the
key strategy in the proof of Lemma 6.7, which is based on the existence of overlapping
bases, falls apart). Given some enumeration $(e^{(k)}_1, \ldots, e^{(k)}_n)_{k \in \mathbb{N}}$ of the countable set of
all resolutions of the identity drawn from $\mathcal{P}_1(H)_c$, so that each $(e^{(k)}_1, \ldots, e^{(k)}_n)$ is a
basis of $H$, $k \in \mathbb{N}$, each possible coloring $W = W_f$ bijectively corresponds to some
function $f : \mathbb{N} \to \{1,\ldots,n\}$ through

$$W_f(e) = 1 \text{ if } e = e^{(k)}_{f(k)}; \tag{6.230}$$

$$W_f(e) = 0 \text{ otherwise}. \tag{6.231}$$


Note that because of the total incompatibility of the projections, each $e \in \mathcal{P}_1(H)_c$
belongs to a unique resolution $(e^{(k)}_i)$, so that $W_f$ is well defined. The statistical
predictions of quantum mechanics may then be recovered as follows. For each density
operator $\rho \in \mathcal{D}(H)$ we may define a probability measure $\mu_\rho$ on the set $n^{\mathbb{N}}$ of all
functions $f : \mathbb{N} \to \{1,\ldots,n\}$ by imposing the conditions

$$\mu_\rho\left(\{f \in n^{\mathbb{N}} \mid W_f(e^{(k)}_i) = \lambda^{(k)}_i \ \forall\, i = 1,\ldots,n,\ k \in K\}\right) = \prod_{k \in K} \mathrm{Tr}\left(\rho \prod_{i=1}^{n} [e^{(k)}_i = \lambda^{(k)}_i]\right), \tag{6.232}$$

where $\lambda^{(k)}_i \in \{0,1\}$, $K \subset \mathbb{N}$ is finite, and $[e^{(k)}_i = \lambda^{(k)}_i]$ is the projection onto the
corresponding eigenspace of the projection $e^{(k)}_i$ (more generally, for $a \in B(H)_{sa}$
we write $[a = \lambda]$ for the spectral projection $e_\lambda$ defined by $a$ and $\lambda \in \sigma(a)$). The
subset of $n^{\mathbb{N}}$ in the argument of $\mu_\rho$ is hereby declared measurable; existence and
uniqueness of the measure $\mu_\rho$ on a suitable $\sigma$-algebra follow from the Kolmogorov
extension theorem of measure theory, which applies because the marginals (6.232)
satisfy the appropriate consistency conditions, cf. Hermens (2009) for details.
This formula guarantees that the left-hand side vanishes if $\lambda^{(k)}_i = 0$ for each $i$,
and also if $\lambda^{(k)}_i = 1$ for more than one value of $i$. If $K = \{k_0\}$ is a singleton and
$\lambda = (\lambda_1,\ldots,\lambda_n)$, then the right-hand side (and hence the left-hand side) is the Born
probability for the outcome $e^{(k_0)}_i = \lambda_i$ for each $i$, i.e.,





(6.233) 


Consequently, it is true by construction that for any admissible measurement in
quantum mechanics (in that all observables commute), i.e., for each $k_0 \in \mathbb{N}$,
averaging over the ‘hidden variable’ $f \in n^{\mathbb{N}}$ reproduces the statistical predictions of
quantum mechanics. This success is achieved at a high cost, however:

• Two random variables $e^{(k)}_i$ and $e^{(k')}_{i'}$ are statistically independent (with respect to
$\mu_\rho$) whenever $k \neq k'$, even though $\|e^{(k)}_i - e^{(k')}_{i'}\|$ may be arbitrarily small.

• For each $f \in n^{\mathbb{N}}$ the associated coloring $W_f$ is maximally discontinuous, in that
for each $e_u \in \mathcal{P}_1(H)_c$ and each $\varepsilon > 0$ there is $e_{u'} \in \mathcal{P}_1(H)_c$ such that although
$\|e_u - e_{u'}\| < \varepsilon$ one has $W_f(e_u) \neq W_f(e_{u'})$, so that in fact $|W_f(e_u) - W_f(e_{u'})| = 1$.

These facts were noted by Clifton & Kent themselves, and Appleby (2005) proved
that they are a necessary feature of all constructions that involve sufficiently large
subsets of $\mathcal{P}_1(H)$ that can be colored.
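In the simplest case, the rational coloring (6.227)-(6.229) can be evaluated directly, which makes both the coloring property and its discontinuity tangible. The following Python sketch is only an illustration under our reading of (6.227); the function names `delta` and `color` and the sample points are our own choices and do not come from Meyer (1999) or Havlicek et al (2001).

```python
from fractions import Fraction
from functools import reduce
from math import gcd, sqrt


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def delta(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Auxiliary integer of (6.227) for a rational point (x, y, z) on the unit sphere."""
    assert x * x + y * y + z * z == 1, "point must lie on the unit sphere"
    # Fraction stores each coordinate in lowest terms (zero is stored as 0/1).
    numerators = (x.numerator, y.numerator, z.numerator)
    denominators = (x.denominator, y.denominator, z.denominator)
    value = z * reduce(lcm, denominators) / reduce(gcd, numerators)
    assert value.denominator == 1
    return int(value)


def color(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Coloring (6.228)-(6.229): 0 if delta is odd, 1 if delta is even."""
    return 0 if delta(x, y, z) % 2 else 1


# A rational orthonormal basis of R^3 (illustrative choice).
third = Fraction(1, 3)
basis = [
    (1 * third, 2 * third, 2 * third),
    (2 * third, 1 * third, -2 * third),
    (2 * third, -2 * third, 1 * third),
]

# Two nearby rational points on the sphere that receive different colors.
north = (Fraction(0), Fraction(0), Fraction(1))
near = (Fraction(101, 5101), Fraction(0), Fraction(5100, 5101))
distance = sqrt(sum(float(p - q) ** 2 for p, q in zip(north, near)))

if __name__ == "__main__":
    print("colors on the basis:", [color(*u) for u in basis])
    print("color(north) =", color(*north), " color(near) =", color(*near))
    print("Euclidean distance between them:", distance)
```

The two sample points lie within a Euclidean distance of 0.02 of each other yet are colored differently, a small-scale instance of the discontinuity discussed above.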

Without challenging their mathematical significance, these discontinuities undermine
any potential physical relevance such models might have, and this in turn
challenges the reason such models were introduced in the first place (Meyer, 1999), 
namely the (alleged) precision loophole of the Kochen-Specker Theorem. 

The thrust of this loophole is that it would be an illusion for an experimentalist
like Alice to claim that she measures some observable $a$ with infinite accuracy;
in fact, given $\varepsilon > 0$ she might equally well measure some $a'$ with $\|a - a'\| < \varepsilon$.
Consequently, finding a dense colorable subset $\mathcal{P}_1(H)_c \subset \mathcal{P}_1(H)$ should suffice
for a hidden variable interpretation of quantum mechanics, since if Alice believes
she measures some projection $e$, the model assigns a value $W(e')$ to the projection
$e' \in \mathcal{P}_1(H)_c$ she actually measures (where $e'$ is selected by some algorithm that
is part of the theory itself, cf. Clifton & Kent (2000)), and presents that value to
Alice as the outcome of her measurement. However, owing to the discontinuities
just mentioned, this value is as arbitrary as the identification of $e'$.

As emphasized by Barrett & Kent (2004), this arbitrariness, although perhaps 
undesirable, does not by itself affect the ability of the Clifton-Kent model to reproduce
the statistical predictions of quantum mechanics. On the other hand, it would be
pretty awkward to have a theory whose individual value attributions are completely 
arbitrary, especially since the finite precision argument is predicated on the idea that 
observables close to the one Alice believes herself to measure (i.e., e) should have 
approximately the same value as the one she actually does measure (namely, e'). 
If this is not the case, her measurements are pointless and the hidden variable Wf 
would be empirically inaccessible and hence truly “hidden” (Appleby, 2005). 
See also Hermens (2009, 2016). This last point applies to Corollary 6.12, which
would no longer be true if the set $X_A$ of all bases of $\mathbb{R}^3$ in Definition 6.11 were
replaced by some subset $X_A^c \subset X_A$ drawn from a colorable subset $S^2_c$ of $S^2$. Each $z \in X_Z$
would then correspond to some coloring $u \mapsto \hat{F}(u,z)$ of $S^2_c$, which, by the above
discussion, would be maximally discontinuous and hence empirically inaccessible.
Nonetheless, such a theory does exist in principle.

The aim of maximizing colorable sets was pursued in a different direction by Bub
& Clifton (1996); see also Bub (1997). Given a “preferred” observable $a \in B(H)_{sa}$
and a pure state $e \in \mathcal{P}_1(H)$, these authors look for a maximal sublattice $\mathcal{P}(e,a)$ of
$\mathcal{P}(H)$ that contains all spectral projections of $a$ (but, despite the notation $\mathcal{P}(e,a)$,
does not necessarily contain $e$!), admits sufficiently many lattice homomorphisms
$h : \mathcal{P}(e,a) \to \{0,1\}$ (i.e., binary valuations) such that the Born measure on
$\sigma(a)$, i.e., $\mu_e(\Delta) = \mathrm{Tr}(e e_\Delta)$, $\Delta \subset \sigma(a)$, can be reproduced by averaging over these
homomorphisms, and finally is invariant under all unitary isomorphisms of $\mathcal{P}(H)$
that commute with both $e$ and $a$. Equivalently, one wants a maximal C*-subalgebra
$A(a,e)$ of $B(H)$ that contains $a$, admits sufficiently many dispersion-free states so as
to reproduce the Born probabilities defined by $a$ in the given state $e$, and is invariant
in the said way (a fourth condition used by Bub and Clifton is superfluous; see Bub,
1997, p. 128). Assuming for simplicity that $n = \dim(H) < \infty$, the answer is

$$A(a,e) = C^*(e_\lambda e e_\lambda, \lambda \in \sigma(a))', \tag{6.234}$$

where, as always, $e_\lambda$ is the projection onto the eigenspace $H_\lambda$ for $\lambda \in \sigma(a)$, and the
prime denotes the commutant (one might as well take the commutant of the set of all
$e_\lambda e e_\lambda$). Equivalently, putting $e = e_\psi = |\psi\rangle\langle\psi|$, eq. (6.234) is the C*-algebra generated
by all projections $f_\lambda$ onto the nonzero components $e_\lambda \psi$ of $\psi$ in each $H_\lambda$ and all
one-dimensional projections that are orthogonal to all $f_\lambda$ (given that $\dim(H) < \infty$,
this is the same as the linear span of these projections). Thus $A(a,e)$ always contains
$C^*(a)$ (since it contains each $e_\lambda$, $\lambda \in \sigma(a)$), but note that $A(a,e)$ need not be
commutative. In comparison, if the requirement had been the reproduction of all Born
probabilities for arbitrary pure states $e$ rather than for some given $e$, the answer
would have been any maximal abelian C*-algebra in $B(H)$ that contains $C^*(a)$; if $a$
has non-degenerate spectrum, this is just $C^*(a)$ itself. The simplest possibility is

$$A(1_H,e) = C^*(e)' = \{e\}', \tag{6.235}$$

which is the linear span of all projections $f \in \mathcal{P}(H)$ for which either $e \leq f$ or
$e \leq 1_H - f$ (i.e., if $e = e_\psi$, then either $\psi \in fH$ or $\psi \in (fH)^\perp$). In other words, we
have $a \in A(1_H,e)$ iff $\psi$ is an eigenvector of $a$ (i.e. the eigenvector-eigenvalue link).

Each dispersion-free state on $A(a,e)$, or, equivalently, each homomorphism
$h_\lambda : \mathcal{P}(e,a) \to \{0,1\}$, corresponds to one of the projections $f_\lambda$ through $h_\lambda(f_\lambda) = 1$
and $h_\lambda(f) = 0$ for all other one-dimensional projections $f$ in $\mathcal{P}(e,a)$. The Born
probabilities from $e$ are then recovered by assigning (Born) measure $\mathrm{Tr}(e f_\lambda)$ to $h_\lambda$.

Though interesting, this result mainly supports so-called modal interpretations of 
quantum mechanics, which we reject, since they tell us nothing physical about the 
measurement process and address the measurement problem only philosophically. 
§6.2. The Free Will Theorem

The Free Will Theorem was published in two versions by Conway & Kochen
(2006, 2009). Analogous results had previously been published by Heywood &
Redhead (1983), Stairs (1983), Brown & Svetlichny (1990), and Clifton (1993),
of which only the first paper was cited by Conway and Kochen. Moreover, the
close relationship to Bell’s (1964) Theorem might well be insisted on as a topic that
should have been discussed in the original papers. Other critical literature (making
the points listed in the preamble to this chapter) includes Bassi & Ghirardi (2007), ’t
Hooft (2007), Goldstein et al (2010), Wüthrich (2011), Hemmick & Shakur (2012),
Cator & Landsman (2014), Hermens (2014, 2015), and Walleczek (2016).

The original (Strong) Free Will Theorem (FWT) states that three assumptions,
called SPIN, TWIN, and MIN, imply that the response of a spin-one particle to the
bipartite experiment with spin-one particles described above ‘is not a function of
properties of that part of the universe that is earlier than this response (...).’ Here
SPIN and TWIN are the first and second half of our Nature axiom, whilst MIN expresses
a form of context-locality as well as the loose assumption that Alice and
Bob may ‘freely choose’ their settings a and b, respectively. Accordingly, in our
notation, Conway and Kochen only use the parameter space Z, rather than the full 
space X we need in order to consistently axiomatize determinism. Their formulation 
contains an implicit assumption of determinism, whose precise nature only becomes 
clear from their proof, and which is akin to our formulation, except for the crucial 
difference that the function they allude to only acts on the particle variables and not 
on the settings of the experiment (of which, as already noted, Conway and Kochen 
just say that the experimenters can ‘freely choose’ them). 

Conway and Kochen paraphrase their theorem as follows: 

‘if indeed we humans have free will, then elementary particles already have their own small 
share of this valuable commodity. More precisely, if the experimenter can freely choose 
the directions in which to orient his apparatus in a certain measurement, then the particle’s
response (to be pedantic—the universe’s response near the particle) is not determined by the 
entire previous history of the universe. (...) our theorem asserts that if experimenters have 
a certain freedom, then particles have exactly the same kind of freedom. Indeed, it is natural 
to suppose that this latter freedom is the ultimate explanation of our own. (...) Granted our 
three axioms [i.e., the physical ones and freedom of choice], the Free Will Theorem shows 
that nature itself is nondeterministic.’ 

However, such far-reaching conclusions seem unwarranted by the actual technical 
content of the theorem. Indeed, though it is also assumed in Bell’s first theorem (see
§6.5 below), the conjunction of Determinism and Freedom is a priori uncomfortable,
especially since the main novelty of the FWT lies in the emphasis Conway and
Kochen (unlike Bell) put on free will. The authors acknowledge at least this point 
already on the first page of their first paper (Conway & Kochen, 2006), in which 
they anticipate criticism of the kind: 

‘“I saw you put the fish in!” said a simpleton to an angler who had used a minnow to catch 
a bass.’ 

Indeed, also after more serious philosophical analysis, it has been concluded that: 
‘Their [Conway & Kochen’s] case against determinism thus has all the virtues of theft over
honest toil. It is truly indeterminism in, indeterminism out.’ (Wüthrich, 2011)

Our formulation of the FWT, in which the original allusion to undefined free will in
allowing arbitrary settings of the experiment has been replaced by complete determinism
including the settings, avoids this criticism.

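To derive (6.35) - (6.38), we use (6.21) to write down the formulae

$$P_{\psi_0}(F_i = 1, G_j = 1 \mid A = a, B = b) = \langle \psi_0, (1_3 - |u_i\rangle\langle u_i|) \otimes (1_3 - |v_j\rangle\langle v_j|) \psi_0 \rangle;$$

$$P_{\psi_0}(F_i = 0, G_j = 0 \mid A = a, B = b) = \langle \psi_0, |u_i\rangle\langle u_i| \otimes |v_j\rangle\langle v_j| \psi_0 \rangle;$$

$$P_{\psi_0}(F_i = 1, G_j = 0 \mid A = a, B = b) = \langle \psi_0, (1_3 - |u_i\rangle\langle u_i|) \otimes |v_j\rangle\langle v_j| \psi_0 \rangle;$$

$$P_{\psi_0}(F_i = 0, G_j = 1 \mid A = a, B = b) = \langle \psi_0, |u_i\rangle\langle u_i| \otimes (1_3 - |v_j\rangle\langle v_j|) \psi_0 \rangle.$$

For example, for any pair of unit vectors $u$, $v$ we have

$$\langle \psi_0, |u\rangle\langle u| \otimes |v\rangle\langle v| \psi_0 \rangle = \tfrac{1}{3} \langle e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3, |u\rangle\langle u| \otimes |v\rangle\langle v| (e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3) \rangle$$

$$= \tfrac{1}{3} \langle e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3, \langle u,v \rangle\, u \otimes v \rangle$$

$$= \tfrac{1}{3} \langle u,v \rangle^2,$$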

which gives (6.36). The other cases are similar. 
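The four expectation values above are easy to check numerically. The following NumPy sketch is an illustration only: the helper names `proj` and `joint_probabilities` and the sample directions are ours, and we assume real unit vectors in $\mathbb{R}^3$ and the correlated state $\psi_0 = (e_1 \otimes e_1 + e_2 \otimes e_2 + e_3 \otimes e_3)/\sqrt{3}$ that appears in the computation above.

```python
import numpy as np

# Correlated state psi_0 = (e1 (x) e1 + e2 (x) e2 + e3 (x) e3) / sqrt(3).
# Flattening the 3x3 identity matches the index ordering used by np.kron.
psi0 = np.eye(3).reshape(9) / np.sqrt(3)


def proj(u):
    """One-dimensional projection |u><u| onto the (normalized) real vector u."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return np.outer(u, u)


def joint_probabilities(u, v):
    """Return {(x, y): P(F_i = x, G_j = y)} for Alice's direction u and Bob's v.

    Outcome 0 corresponds to |u><u| and outcome 1 to 1_3 - |u><u|,
    following the four formulae in the text.
    """
    one = np.eye(3)
    alice = {0: proj(u), 1: one - proj(u)}
    bob = {0: proj(v), 1: one - proj(v)}
    return {
        (x, y): float(psi0 @ np.kron(alice[x], bob[y]) @ psi0)
        for x in (0, 1)
        for y in (0, 1)
    }


# Illustrative directions with <u, v> = 2/3.
u = np.array([1.0, 2.0, 2.0]) / 3.0
v = np.array([0.0, 0.0, 1.0])
probs = joint_probabilities(u, v)

if __name__ == "__main__":
    for outcome, p in sorted(probs.items()):
        print(outcome, round(p, 6))
    print("(1/3) <u,v>^2 =", (u @ v) ** 2 / 3)
```

For these directions the entry for the outcome pair (0, 0) agrees with $\tfrac{1}{3}\langle u,v \rangle^2 = 4/27$, and the four probabilities add up to one.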

The implications of the finite precision loophole of the Kochen-Specker Theorem
for the Free Will Theorem were analyzed by Hermens (2014), who concluded
that this loophole does not apply. We give a more precise argument to this effect.

We have dense colorable subsets $X_A^c \subset X_A$ and $X_B^c \subset X_B = X_A$, where $X_A^c$ may
or may not coincide with $X_B^c$. If not, the perfect correlation condition (6.54) in the
Nature assumption cannot even be stated, but even if $X_A^c = X_B^c$, since finite precision
of experiment has been declared to be an issue it would be quite out of character to
impose (6.54). Instead, one needs a probabilistic version of this condition, of which
it will turn out that it cannot be satisfied. As in the notes to the previous section, for
each density matrix $\rho$ one needs a probability measure $\mu_\rho$ on $Z$ that reproduces the
statistical quantum-mechanical predictions for the associated quantum state. Compared
to the notes to the previous section, the role of $W$ is now played by $z$, in that
for given $F$ and $G$ one might write

$$W(a,b) = (F(a,z), G(b,z)). \tag{6.236}$$

This measure may be constructed analogously to (6.232), i.e., for any sequence 
of bases drawn fvomX^, any sequence {b^^'^) of bases drawn from2f|, and any 
sequences and in A, cf. (6.22), where kG K cNis arbitrary, we define 

Ppi{zGZ\Fia^'^\z)=X , G{b^'^^, z) = ,kGK} = 

n Tr f P fl [4 = ■ 4 = rf ]) , (6.237) 

k^K \ ^7=1 / 
where, as in the main text,

$$a = (u_1,u_2,u_3); \tag{6.238}$$

$$b = (v_1,v_2,v_3). \tag{6.239}$$

Note that 7^. acts on Alice’s Hilbert space whilst J^. acts on Bob’s. In particular, 
for fixed ko G K and X,Y G A, we have the special case of (6.237) for compatible 
measurements, viz. 

fip{{z G Z I =X,G{b^’^\z) = 7 } = Tr ( p fj ■ [4 = H 

\ f;=i 

where in the main text we would have written Pp{F = X ,G = pL\A = a,B = b) for the 
right-hand side. Hence for the correlated state $\rho = |\psi_0\rangle\langle\psi_0|$ we obtain from (6.42):

$$\mu_{\psi_0}(\{z \in Z \mid F_i(a,z) \neq G_j(b,z)\}) = \tfrac{2}{3}(1 - \langle u_i,v_j \rangle^2), \tag{6.240}$$

which of course vanishes if $u_i = v_j$. If the expression $1 - \langle u_i,v_j \rangle^2$ appearing here is
small, then the projections $e_{u_i}$ and $e_{v_j}$ are close (in norm), since

$$\|e_{u_i} - e_{v_j}\|^2 \leq 2(1 - \langle u_i,v_j \rangle^2). \tag{6.241}$$
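Both (6.240) and (6.241) can be checked numerically for randomly chosen directions. The sketch below is illustrative only: the helper names are ours, the vectors are assumed real, the mismatch probability is computed as the Born probability of the two outcome pairs with $F_i \neq G_j$ in the state $\psi_0$, and the norm is taken to be the operator norm.

```python
import numpy as np


def projection(u):
    """One-dimensional projection onto the (normalized) real vector u."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return np.outer(u, u)


def overlap_defect(u, v):
    """The quantity 1 - <u, v>^2 for the normalized vectors u and v."""
    u = np.asarray(u, dtype=float) / np.linalg.norm(u)
    v = np.asarray(v, dtype=float) / np.linalg.norm(v)
    return float(1.0 - (u @ v) ** 2)


def mismatch_probability(u, v):
    """Born probability that the outcomes for directions u and v differ."""
    psi0 = np.eye(3).reshape(9) / np.sqrt(3)
    eu, ev, one = projection(u), projection(v), np.eye(3)
    differ = np.kron(eu, one - ev) + np.kron(one - eu, ev)
    return float(psi0 @ differ @ psi0)


def squared_norm_distance(u, v):
    """Squared operator norm of e_u - e_v (largest singular value, squared)."""
    return float(np.linalg.norm(projection(u) - projection(v), ord=2) ** 2)


def check(samples=200, seed=0):
    """Compare both sides of (6.240) and (6.241) on random pairs of directions."""
    rng = np.random.default_rng(seed)
    for _ in range(samples):
        u, v = rng.normal(size=3), rng.normal(size=3)
        d = overlap_defect(u, v)
        if not np.isclose(mismatch_probability(u, v), 2.0 * d / 3.0):
            return False
        if squared_norm_distance(u, v) > 2.0 * d + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print("(6.240) and (6.241) hold on all samples:", check())
```

In the operator norm the bound (6.241) holds with room to spare, since the squared norm of the difference of two one-dimensional projections equals $1 - \langle u,v \rangle^2$.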

Eq. (6.240) therefore allows us to make rigorous sense of Hermens’ (2014) heuristic
idea that the assumption (6.54) in the FWT should be modified as follows:

‘if $\|e_{u_i} - e_{v_j}\|$ is small, then in most of the cases $F_i(a,z) = G_j(b,z)$’

Namely, we replace (6.54) by the following approximate correlation condition:

• For every $\varepsilon > 0$ there is $\delta > 0$ such that if $1 - \langle u_i,v_j \rangle^2 < \delta$, then

$$\mu_{\psi_0}(\{z \in Z \mid F_i(a,z) \neq G_j(b,z)\}) < \varepsilon. \tag{6.242}$$

Indeed, if the theory existed, one could simply take $\delta = \varepsilon$. However, a theory satisfying
(6.242) does not exist, as can be proved by contradiction: if $F_i(a,z) = G_j(b,z)$
for all pairs $(u_i,v_j)$ such that $1 - \langle u_i,v_j \rangle^2 < \varepsilon$, then the proof of Theorem 6.13
shows not only that (6.32) still holds on the modified Nature assumption (so that
$\hat{F}(\cdot,z)$ again defines a coloring of $S^2_c$), but that in addition we have

$$1 - \langle u,u' \rangle^2 < \delta \ \Rightarrow\ \hat{F}(u,z) = \hat{F}(u',z). \tag{6.243}$$

In particular, the apparently weaker correlation condition ending with (6.242) is
actually stronger than its exact counterpart (6.54).

Thus Theorem 6.13 still holds on this revised Nature assumption, so that unlike
the Kochen-Specker Theorem, the Free Will Theorem is immune to the finite precision
loophole. The price for this immunity is that, quite against the spirit of the
FWT, some probabilistic reasoning had to be invoked, so that the difference between 
the FWT and Bell’s first theorem has blurred even further. 
§6.3. Philosophical intermezzo: Free will in the Free Will Theorem

The literature on free will is immense. Introductory accounts include Walter 
(2001), which focuses on the connection with neuroscience, Doyle (2011), and 
Beebee (2013), the second of which remains largely philosophical, the third even 
completely. A very sophisticated recent defense of compatibilism is Ismael (2016). 
Lewis’s ‘local miracle compatibilism’ was proposed in Lewis (1981). What’s more: 

‘[Lewis’s paper is] the finest essay that has ever been written in defense of compatibilism— 
possibly the finest essay that has ever been written about any aspect of the free will problem.’ 
(van Inwagen, 2008). 

Saunders (1968) already made a point similar to Lewis’s; see also Moore (1912, Ch. 6).
For Lewis’s theory of counterfactuals see Lewis (1973, 1979, 2000), as well as
Menzies (2014). See also Fischer (1994), Beebee (2003,2013), and Vihvelin (2013). 

Although Lewis’s position is called local miracle compatibilism, a miracle takes
place neither in the actual world where Alice’s hand is at rest nor in the possible 
world where she raises it, i.e., a law is broken neither in the former nor in the latter: 

‘This is what Lewis means by a ‘miracle’: an event M is a miracle if and only if M occurs 
at possible world w, and M is contrary to some actual law (or combination of laws) L. The 
point here is that while M is a miracle in Lewis’s sense, it is not contrary to any of w’s laws 
of nature. At w, L simply isn’t a law in the first place. So, as things actually happened— 
in the actual world— L is a law, and m does not occur, so there is no miracle in the usual 
sense of ‘miracle’, m is only a ‘miracle’ in Lewis’s special sense of ‘miracle’: something 
(m) happens in w that is contrary to the laws of nature in the actual world.’ 

(Beebee, 2013, p. 62) 

Unfortunately, confusion may arise if the quotation in the main text ‘if I did it, a law 
would be broken’ from Lewis (1981) is subjected to the following explanation: 

‘On Lewis’s account of counterfactuals, the truth conditions for counterfactuals—what 
makes them true—are as follows. Suppose we have the counterfactual ‘if A had been the 
case, B would have been the case’ (so if A is ‘I miss the bus’ and B is ‘I’m late’, this counterfactual
just says, ‘if I’d missed the bus, I would have been late’). This counterfactual will
be true if and only if, at the closest possible world to the actual world at which A is true, B 
is also true. So, our sample counterfactual, ‘if I’d missed the bus, I would have been late’, 
is true if and only if: at the closest possible world to the actual world at which I miss the
bus, I’m late.’ (Beebee, 2013, p. 60).

Removing any possible remaining doubt, on p. 62 she mentions that the closest 
possible world where I miss the bus is the world w. According to this explanation, 
then, Lewis’s sentence ‘if I did it, a law would be broken’, would mean that at the 
closest possible world to the actual world in which I did it, a law is broken, i.e., in w. 
But according to Beebee’s definition quoted in the main text of what Lewis means 
by a miracle, apparently this is not the right reading (and indeed it would, in our 
view, be nonsensical). Moreover, Lewis (1981) emphasizes that in the first bullet 
point in the main text above—which he defends—it is not the agent who would 
break a law, whereas in the second bullet point —rejected by Lewis—it is; in the 
first it is the breaking of some law at an earlier time that enables the agent to do 
what she, in our actual world, did not do. Thus Lewis’s phrasing seems awkward. 







Our development of Lewis’s argument is indebted to Vihvelin (2013, pp. 164-165),
who (re)states Lewis’s first bullet point as the following conjunction:

1. Slightly Different Past: If I had raised my hand, the past would still have been
exactly the same until shortly before the time of my decision. 

2. Slightly Different Laws: If I had raised my hand, the laws would have been ever 
so slightly different in a way that permitted a divergence from the lawful course 
of actual history shortly before the time of my decision. 

A second way in which Alice could (counterfactually) have raised her hand is
through an instant (counterfactual) modification of the state of the world, as in Bennett
(1984). This has been explicated by Vihvelin (2013, p. 165), too:

1. Same Laws: If I had raised my hand, the laws would still have been the same.

2. Completely Different Past: If I had raised my hand, past history would have 
been different all the way back to the Big Bang. 

Here we prefer to write Different Past, since even though in this scenario the state 
indeed (by determinism) would have been different all the way back to the Big Bang, 
the entire trajectory of the world may or may not be close to the actual one. In this 
scenario, the two cases Lewis distinguishes take the form in the main text. 

Since the main novelty of their papers lies in the emphasis on free will, the reader 
might wonder what Conway & Kochen themselves have to say about the subject. As 
we can read in the delightful biography of Conway by Roberts (2015), or watch in 
his video lectures on the Free Will Theorem (Conway, 2009), free will is indeed of 
great importance to at least the first author of the theorem. Unfortunately, his interest 
in free will seems unaccompanied by any philosophical sophistication, e.g.: 

‘Compatibilism in my view is silly. Sorry, I shouldn’t just say straight off that it is silly. 
Compatibilism is an old viewpoint from previous centuries when philosophers were talking 
about free will. They were accustomed to physical theory being deterministic. And then
there’s the question: How can we have free will in this deterministic universe? Well, they 
sat and thought for ages and ages and ages and read books on philosophy and God knows 
what and they came up with compatibilism, which was a tremendous wrenching effect to 
reconcile 2 things which seemed incompatible. And they said they were compatible after 
all. But nobody would ever have come up with compatibilism if they thought, as turns out 
to be the case, that science wasn’t deterministic. The whole business of compatibilism was 
to reconcile what science told you at the time, centuries ago down to 1 century ago: Science 
appeared to be totally deterministic, and how can we reconcile that with free will, which 
is not deterministic? So compatibilism, I see it as out of date, really. It’s doing something 
that doesn’t need to be done. However, compatibilism hasn’t gone out of date, certainly, 
as far as the philosophers are concerned. Lots of them are still very keen on it. How can 
I say it? If you do anything that seems impossible, you’re quite proud when you appear 
to have succeeded. And so really the philosophers don’t want to give up this notion of 
compatibilism because it seems so damned clever. But my view is it’s really nonsense. And
it’s not necessary. So whether it actually is nonsense or not doesn’t matter.’ 

(Conway, quoted in Roberts, 2015, pp. 361-362). 

Finally, our version of van Inwagen’s (1975) Consequence Argument is due to 
Beebee (2003), and the novel parts of this section are based on Landsman (2016c). 
For interesting philosophical criticism of this approach, see De Mola (2016). 
§6.4. Technical intermezzo: The GHZ-Theorem

The GHZ Theorem appeared in Greenberger et al (1990). See also Clifton, Redhead,
& Butterfield (1991) and Bub (1997). Innumerable variations on and generalizations
of such arguments may be given, leading to equally many Free Will
Theorems. All of these have their roots in algebraic properties of matrices, which
hidden variable theories (in vain) try to reproduce.

§6.5. Bell’s theorems 

The original contributions to the theme of this section are Bell (1964, 1976), of 
which the first is one of the most famous papers of 20th century theoretical physics. 
Since there are more than 10,000 papers citing Bell (1964) alone, it is impossible 
to discuss all literature relevant to Bell’s work. What we call his first theorem originates
with Bell (1964), which incidentally was written after Bell (1966), but our
treatment of the settings (taken from Cator & Landsman, 2014) is different. Though
originally motivated as an attempt to make the Free Will Theorem look less of a petitio
principii, it also addresses a problem Bell faced even according to some of his
staunchest supporters (Norsen, 2009; Seevinck & Uffink, 2011), namely the tension 
between the idea that the hidden variables (in the pertinent causal past) should on 
the one hand include all ontological information relevant to the experiment, but on 
the other hand should leave Alice and Bob free to choose any settings they like. 

His second theorem comes from Bell (1976), followed by Bell (1990a). 

Apart from his own papers, which are reprinted in Bell, Gottfried & Veltman (2001),
treatments of Bell’s Theorems we regard as sound include Fine (1982), Jarrett
(1984), Pitowsky (1989), van Fraassen (1991), Butterfield (1992a,b), Bub (1997),
Werner & Wolf (2001), Liang, Spekkens, & Wiseman (2011), Shimony (2013),
Wiseman (2014), and Brown & Timpson (2015). Recent and mathematically innovative
approaches include Abramsky & Brandenburger (2011), Acín et al (2015),
and Fritz (2016). For history, see Gilder (2008) and Kaiser (2010).

Unfortunately, we have not been able to come to grips with (and hence do not 
cite) literature claiming that Bell’s theorems are false, or have nothing to do with 
hidden variables, or prove that quantum mechanics (if not nature itself!) is nonlocal 
per se, or that he never changed his mind and only has one theorem saying it all. 

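The verification of (6.102) - (6.105) is analogous to the above computations deriving
(6.35) - (6.38). In terms of the unit vector

$$v_a = \begin{pmatrix} \cos a \\ \sin a \end{pmatrix}, \tag{6.244}$$

the observable $F$ Alice measures on setting $A = a$ is the projection $e_a = |v_a\rangle\langle v_a|$,
and similarly for Bob. Hence the corresponding Born probabilities are given by

$$P_{\psi_0}(F = 1, G = 1 \mid A = a, B = b) = \langle \psi_0, e_a \otimes e_b \psi_0 \rangle;$$

$$P_{\psi_0}(F = 0, G = 0 \mid A = a, B = b) = \langle \psi_0, (1_2 - e_a) \otimes (1_2 - e_b) \psi_0 \rangle;$$

$$P_{\psi_0}(F = 1, G = 0 \mid A = a, B = b) = \langle \psi_0, e_a \otimes (1_2 - e_b) \psi_0 \rangle;$$

$$P_{\psi_0}(F = 0, G = 1 \mid A = a, B = b) = \langle \psi_0, (1_2 - e_a) \otimes e_b \psi_0 \rangle.$$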
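For example, we have

$$\langle \psi_0, e_a \otimes e_b \psi_0 \rangle = \tfrac{1}{2} \langle e_1 \otimes e_1 + e_2 \otimes e_2, |v_a\rangle\langle v_a| \otimes |v_b\rangle\langle v_b| (e_1 \otimes e_1 + e_2 \otimes e_2) \rangle$$

$$= \tfrac{1}{2} \langle e_1 \otimes e_1 + e_2 \otimes e_2, (\cos a \cos b + \sin a \sin b)\, v_a \otimes v_b \rangle$$

$$= \tfrac{1}{2} (\cos a \cos b + \sin a \sin b)^2$$

$$= \tfrac{1}{2} \cos^2(a - b).$$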
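The same computation can be reproduced numerically. The following NumPy sketch is only an illustration: the function names `e` and `bell_probabilities` and the sample angles are ours, and we assume the state $\psi_0 = (e_1 \otimes e_1 + e_2 \otimes e_2)/\sqrt{2}$ used in the computation above, with outcome 1 assigned to $e_a$ and outcome 0 to $1_2 - e_a$.

```python
import numpy as np

# Correlated state psi_0 = (e1 (x) e1 + e2 (x) e2) / sqrt(2) on C^2 (x) C^2.
psi0 = np.eye(2).reshape(4) / np.sqrt(2)


def e(angle):
    """Projection e_a = |v_a><v_a| with v_a = (cos a, sin a), cf. (6.244)."""
    v = np.array([np.cos(angle), np.sin(angle)])
    return np.outer(v, v)


def bell_probabilities(a, b):
    """Return {(x, y): P(F = x, G = y | A = a, B = b)} in the state psi_0."""
    one = np.eye(2)
    alice = {1: e(a), 0: one - e(a)}
    bob = {1: e(b), 0: one - e(b)}
    return {
        (x, y): float(psi0 @ np.kron(alice[x], bob[y]) @ psi0)
        for x in (0, 1)
        for y in (0, 1)
    }


# Illustrative settings (in radians).
a, b = 0.3, 1.1
probs = bell_probabilities(a, b)

if __name__ == "__main__":
    print("P(F=1, G=1)        =", probs[(1, 1)])
    print("cos^2(a - b) / 2   =", 0.5 * np.cos(a - b) ** 2)
    print("sum of all entries =", sum(probs.values()))
```

The entries with unequal outcomes come out as $\tfrac{1}{2}\sin^2(a-b)$ each, so that the four probabilities sum to one.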

The CHSH-inequality (6.117) is due to Clauser, Horne, Shimony, & Holt (1969).
The definitive (i.e., loophole-free) experimental verification of its violation in nature
is Hensen et al. (2015). A direct proof of (6.117) starts from the simpler inequality

$$P(F \neq H) \leq P(F \neq G) + P(G \neq H), \tag{6.245}$$

for three $\{0,1\}$-valued random variables $F,G,H$, which implies (6.117). To prove
(6.245), one just writes

$$P(F \neq H) = P(F = 1, G = 1, H = 0) + P(F = 1, G = 0, H = 0)$$

$$\qquad + P(F = 0, G = 1, H = 1) + P(F = 0, G = 0, H = 1),$$

etc., and notes that each term on the left-hand side of (6.245) also occurs on the right-hand
side. Since each term lies in [0,1] and hence is nonnegative, this implies (6.245).
Our proof of Proposition 6.17 follows Werner & Wolf (2001), as does our proof of
Theorem 6.18 (though not our formulation thereof, which once again derives from
Cator & Landsman (2014)). This proof shows that, as first noted by Fine (1982) and
analyzed more deeply in Butterfield (1992b), there is no real distinction between
the possibility of reproducing given (empirical) probabilities $P(F = \lambda, G = \gamma \mid A = a, B = b)$
that satisfy Bell locality by a local deterministic hidden variable theory or
by a local stochastic hidden variable theory. Most current research in this direction,
sparked by Popescu & Rohrlich (1994), is therefore concerned with theories defined
by formal joint conditional probabilities that satisfy a no signaling condition like 01 
instead of Bell locality, cf. Bub (2011b) and Brunner et al (2014) for reviews.
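The elementary inequality (6.245) can also be checked by brute force over joint distributions of three $\{0,1\}$-valued random variables. The sketch below is illustrative only; the function names and the random sampling scheme are ours and play no role in the proof.

```python
import itertools
import random

OUTCOMES = list(itertools.product((0, 1), repeat=3))  # all values of (F, G, H)


def prob(dist, event):
    """Probability of an event, given as a predicate on (f, g, h)."""
    return sum(p for outcome, p in dist.items() if event(*outcome))


def random_joint(rng):
    """A random joint distribution of (F, G, H) on the eight outcomes."""
    weights = [rng.random() for _ in OUTCOMES]
    total = sum(weights)
    return {o: w / total for o, w in zip(OUTCOMES, weights)}


def triangle_gap(dist):
    """P(F != G) + P(G != H) - P(F != H), which (6.245) asserts is >= 0."""
    lhs = prob(dist, lambda f, g, h: f != h)
    rhs = prob(dist, lambda f, g, h: f != g) + prob(dist, lambda f, g, h: g != h)
    return rhs - lhs


def check(samples=1000, seed=0):
    """Test (6.245) on randomly sampled joint distributions."""
    rng = random.Random(seed)
    return all(triangle_gap(random_joint(rng)) >= -1e-12 for _ in range(samples))


# The point mass on (F, G, H) = (0, 1, 0) has P(F != H) = 0 and both terms on the right equal to 1.
point_mass = {o: (1.0 if o == (0, 1, 0) else 0.0) for o in OUTCOMES}

if __name__ == "__main__":
    print("(6.245) holds on all samples:", check())
    print("gap for the point mass on (0, 1, 0):", triangle_gap(point_mass))
```

The gap equals twice the probability that $G$ differs from both $F$ and $H$, which is another way of seeing that it cannot be negative.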

Formal conditional probabilities of the kind that Bell’s second theorem uses have
been axiomatized by e.g. Popper (1938) and Rényi (1955); the following axioms are
theorems if conditional probabilities are defined à la Kolmogorov by (1.1). Let $\Sigma$ be
some $\sigma$-algebra and let $\Sigma' \subset \Sigma \setminus \{\emptyset\}$ be an ideal in $\Sigma$ in the sense that if $B \in \Sigma$ and
$C \in \Sigma'$ then $B \cap C \in \Sigma'$. A conditional probability on $(\Sigma, \Sigma')$ is a map

$$P : \Sigma \times \Sigma' \to [0,1]; \tag{6.246}$$

$$(A,C) \mapsto P(A \mid C), \tag{6.247}$$

such that:

1. For each $C \in \Sigma'$ the map $A \mapsto P(A \mid C)$ is a probability measure on $\Sigma$;

2. $P(A \cap B \mid C) = P(A \mid B \cap C) \cdot P(B \mid C)$, for each $A,B \in \Sigma$ and $C \in \Sigma'$.

Van Fraassen (1991) noted that if (6.121) holds, then the variable x is a common
cause in the sense of Reichenbach for Alice’s and Bob’s outcomes (see Hofer-Szabo
(2015) for a recent paper in this direction). To explain this observation, suppose two
random processes F and G (like Alice’s and Bob’s measurements) are correlated,
i.e., $P(F = \lambda, G = \gamma) \neq P(F = \lambda)P(G = \gamma)$. What might cause the correlation?

1. Chance. If Alice and Bob independently throw dice but always get the same 
result, there is a computable nonzero probability for this to happen without any 
reason. But this probability decreases as the number of occurrences grows. 

2. Causation. One outcome influences or even determines the other. Maybe Bob, 
whose experiment is genuinely random, is able to manipulate Alice’s experiment 
once he has seen his outcome. But according to relativity theory or other basic 
notions of causality in space-time, this should be impossible if Alice and Bob 
perform their measurements simultaneously and far from each other. 

3. Ur-determinism. The initial conditions at the Big Bang plus deterministic Laws
of Nature imply the correlation. However, physics becomes pointless if we endorse
this option. The notion of explanation as the purpose of science is defeated
and there is little difference between this argument and Divine Predestination.

4. Identity. The motions of my mirror image are strongly correlated with me, but 
that is because this image is really the same as me (at least in so far as motion is 
concerned, as opposed to e.g. thoughts). This example might also be explained 
using causation. Another example consists of Alice and Bob filming the same 
random process (which may also be explained using the following concept). 

5. Common Cause. A random process $X$ is said to be a common cause for two
correlated random processes if it precedes both and satisfies

$$P(F = \lambda, G = \gamma \mid X = x) = P(F = \lambda \mid X = x)\,P(G = \gamma \mid X = x). \tag{6.248}$$

Another way to write this is $P(F = \lambda \mid G = \gamma, X = x) = P(F = \lambda \mid X = x)$, which
shows that a common cause X screens off the dependence of F on G. Often the 
common cause is hidden and has to be inferred from the observed correlation 
(having excluded other explanations, like the ones above). A nice example of 
this is the inference of a manuscript called Q in New Testament studies. It is 
clear that the Gospels of Matthew and Luke both draw on Mark, but they also 
contain strikingly similar or even identical non-Markan passages. For various 
reasons it is unlikely that either one copied these from the other, so that the main 
hypothesis is that they both rely on Q, which is now lost. See e.g. Mack (1993). 
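
The screening-off property of a common cause can be made concrete in a toy model. The sketch below assumes a hypothetical setup that is not taken from the text: $X$ is a fair coin, and given $X = x$ the binary outcomes $F$ and $G$ are independent and each agree with $x$ with probability 0.9. The joint law is built so that (6.248) holds by construction; the code then checks that $F$ and $G$ are nevertheless correlated, and that conditioning on $X$ removes the dependence of $F$ on $G$.

```python
from itertools import product

# Hypothetical toy model: X is a fair coin; given X = x, F and G are
# independent and each equals x with probability AGREE.
P_X = {0: 0.5, 1: 0.5}
AGREE = 0.9

def p_given_x(value, x):
    """P(F = value | X = x), which equals P(G = value | X = x) here."""
    return AGREE if value == x else 1.0 - AGREE

def joint(f, g, x):
    """P(F = f, G = g, X = x), factorized as in (6.248)."""
    return P_X[x] * p_given_x(f, x) * p_given_x(g, x)

def marginal_fg(f, g):
    return sum(joint(f, g, x) for x in P_X)

def marginal_f(f):
    return sum(marginal_fg(f, g) for g in (0, 1))

def marginal_g(g):
    return sum(marginal_fg(f, g) for f in (0, 1))

def is_correlated(tol=1e-12):
    """True if P(F, G) differs from P(F) P(G) for some pair of outcomes."""
    return any(abs(marginal_fg(f, g) - marginal_f(f) * marginal_g(g)) > tol
               for f, g in product((0, 1), repeat=2))

def screens_off(tol=1e-12):
    """Check P(F = f | G = g, X = x) == P(F = f | X = x) for all f, g, x."""
    for f, g, x in product((0, 1), repeat=3):
        p_gx = sum(joint(ff, g, x) for ff in (0, 1))
        if abs(joint(f, g, x) / p_gx - p_given_x(f, x)) > tol:
            return False
    return True
```

In this model $P(F=1,G=1) = 0.41$ whereas $P(F=1)P(G=1) = 0.25$, so the correlation is real, yet it disappears once $X$ is known.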

From this perspective, the amazing fact is that the correlations in the Alice and 
Bob experiment with either spin-1 particles or photons cannot be explained by a
common cause, since its existence (in the form of x) would imply the Bell inequality. 
However, of the four other explanations described above, no. 1 is ridiculous given 
the statistics of the relevant experiments, no. 2 is at odds with relativity, and no. 
4 seems inapplicable. This leaves no. 3, which seems to be supported only by ’t Hooft
(2016), who denies the independence assumptions (i.e. between the settings and the 
state of the pair of particles undergoing measurement) lying at the basis of both the 
Free Will Theorem and Bell’s theorems. Every way you look at it you lose! 

Generalizations of Theorem 6.19 to operator algebras were given e.g. by Baez
(1987), Raggio (1988), Werner (1989), and Bacciagaluppi (1993), as follows. Let $A$
and $B$ be unital C*-algebras, with projective tensor product $A \hat\otimes B$ (i.e., the completion
of the algebraic tensor product $A \otimes B$ in the maximal C*-cross-norm), cf. §C.13;
the choice of the projective tensor product guarantees that each state on $A \otimes B$ extends
to a state on $A \hat\otimes B$ by continuity; conversely, since $A \otimes B$ is dense in $A \hat\otimes B$, each
state on the latter is uniquely determined by its values on the former. In particular,
product states $\rho \otimes \sigma$ and mixtures $\omega = \sum_i p_i \rho_i \otimes \sigma_i$ thereof are well defined on $A \hat\otimes B$.

If $A \subset B(H_1)$ and $B \subset B(H_2)$ are von Neumann algebras, and all states considered
are normal, it is easier to work with the spatial tensor product $A \bar\otimes B$, defined as the
double commutant (or weak completion) of $A \otimes B$ in $B(H_1 \otimes H_2)$. Any normal state
on $A \otimes B$ extends to a normal state on $A \bar\otimes B$ by continuity. Below we use $\hat\otimes$, but the
results also work for $\bar\otimes$. In what follows, $A$ and $B$ are unital C*-algebras.

Definition 6.23. Let $\omega$ be a state on $A \hat\otimes B$.

1. A product state is a state of the form $\omega = \rho \otimes \sigma$, i.e., $\omega$ is defined by linear (and
continuous) extension of $\omega(a \otimes b) = \rho(a)\sigma(b)$.

2. A state $\omega$ is uncorrelated when it is in the w*-closure of the convex hull of the
product states on $A \hat\otimes B$. In particular, states $\omega = \sum_i p_i \rho_i \otimes \sigma_i$, where $p_i > 0$ and
$\sum_i p_i = 1$, are uncorrelated (w*-convergent infinite sums are allowed here).

3. A state is correlated when it is not uncorrelated.

An uncorrelated state $\omega$ is pure precisely when it is a product of pure states. This
has the important consequence that both its restrictions $\omega_{|A}$ and $\omega_{|B}$ to $A$ and $B$,
respectively, are pure as well (the restriction $\omega_{|A}$ of a state $\omega$ on $A \hat\otimes B$ to, say, $A$ is
given by $\omega_{|A}(a) = \omega(a \otimes 1_B)$, where $1_B$ is the unit element of $B$, etc.). A correlated
pure state has the property that its restriction to $A$ or $B$ is mixed.

Proposition 6.24. The following conditions are equivalent:

• Each state on $A \hat\otimes B$ is uncorrelated;

• Each pure state on $A \hat\otimes B$ is a product state;

• At least one of the C*-algebras $A$ and $B$ is commutative.

For the proof see Takesaki (2002), Theorem 4.14.

Corollary 6.25. Correlated states exist iff $A$ and $B$ are both noncommutative.

As one might expect, this result is closely related to the Bell inequalities:

Proposition 6.26. For any state $\omega$ on $A \hat\otimes B$, the following conditions are equivalent:

• $\omega$ is uncorrelated.

• For all self-adjoint operators $a_1, a_2 \in A$ and $b_1, b_2 \in B$ of norm $\leq 1$ we have

$$|\omega(a_1(b_1 + b_2) + a_2(b_1 - b_2))| \leq 2. \tag{6.249}$$

See Baez (1987), Raggio (1988), Bacciagaluppi (1993), and Landsman (2006a).

Corollary 6.27. If $A$ or $B$ is commutative, then (6.249) holds for all states $\omega$.
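
To see (6.249) at work in the simplest noncommutative case, one may take $A = B = M_2(\mathbb{C})$, so that states on the tensor product include vector states on $\mathbb{C}^2 \otimes \mathbb{C}^2$. The sketch below is an illustration under these assumptions: the choice of Pauli matrices for $a_1, a_2$, of rotated combinations for $b_1, b_2$, and of the two test vectors is mine, not the text’s. It evaluates the left-hand side of (6.249) for a product state and for the singlet state.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expectation(psi, op):
    """Vector state omega(op) = <psi, op psi> for a real unit vector psi."""
    op_psi = [sum(op[i][j] * psi[j] for j in range(len(psi)))
              for i in range(len(psi))]
    return sum(p * q for p, q in zip(psi, op_psi))

SZ = [[1.0, 0.0], [0.0, -1.0]]   # Pauli matrix sigma_z
SX = [[0.0, 1.0], [1.0, 0.0]]    # Pauli matrix sigma_x
R = 1.0 / math.sqrt(2.0)

# Self-adjoint operators of norm 1 (an illustrative choice of settings).
a1, a2 = SZ, SX
b1 = scale(R, add(SZ, SX))
b2 = scale(R, add(SZ, SX, sign=-1.0))

def chsh(psi):
    """omega(a1(b1 + b2) + a2(b1 - b2)), with a_i acting on A and b_j on B."""
    op = add(kron(a1, add(b1, b2)), kron(a2, add(b1, b2, sign=-1.0)))
    return expectation(psi, op)

# Basis ordering |00>, |01>, |10>, |11>.
product_state = [1.0, 0.0, 0.0, 0.0]
singlet = [0.0, R, -R, 0.0]
```

With these choices `chsh(product_state)` has absolute value $\sqrt{2} \leq 2$, whereas `abs(chsh(singlet))` equals $2\sqrt{2}$, so the singlet state is correlated in the sense of Definition 6.23.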

An elegant geometric approach to the Bell inequalities was developed by Pitowsky
(1989, 1994), which we now summarize (also cf. Werner & Wolf, 2001).

Suppose we have a bipartite experiment with $m$ different settings $A = a_1, \ldots, a_m$
and $B = b_1, \ldots, b_m$ on each wing, and binary outcomes, i.e., in $\{0,1\}$. We now denote
the probability $P(F = 1 \mid A = a_i)$ that $F(a_i)$ (i.e. the particular property measured
by experiment $F$ at setting $a_i$) is true by $p_i$ ($i = 1, \ldots, m$), and likewise we write $p_{j+m}$
for $P(G = 1 \mid B = b_j)$, i.e., the probability that $G(b_j)$ is true, once again for $j = 1, \ldots, m$.
Furthermore, we abbreviate the probability that $F(a_i)$ and $G(b_j)$ are both true by

$$p_{i,j+m} = P(F = 1, G = 1 \mid A = a_i, B = b_j). \tag{6.250}$$

The $2m + m^2$ “surface probabilities” $p = (p_1, \ldots, p_{2m}, p_{1,m+1}, \ldots, p_{m,2m})$ form a vector
in $\mathbb{R}^{2m+m^2}$, which we wish to constrain by the following assumption: there
is a fact of the matter underlying each experiment according to which the pair
$(F(a_i), G(b_j))$ already had a truth value for each possible setting $(a_i, b_j)$, independently
of any measurement being carried out or not (“local realism”). Thus the
probabilities $p$ (which now arguably have an ignorance interpretation) must lie in
the convex polytope in $\mathbb{R}^{2m+m^2}$ defined as the convex hull $C_m$ of the following set
of (extreme) points: for each $2m$-tuple $\lambda = (\lambda_1, \ldots, \lambda_{2m})$, where $\lambda_i \in \{0,1\}$, define

$$x_\lambda = (\lambda_1, \ldots, \lambda_{2m}, \lambda_1 \cdot \lambda_{m+1}, \ldots, \lambda_m \cdot \lambda_{2m}) \in \mathbb{R}^{2m+m^2}, \tag{6.251}$$

i.e., the entry at place $k$ is $\lambda_k$ ($k = 1, \ldots, 2m$) and the entry at place $(i,j)$ is $\lambda_i \cdot \lambda_{m+j}$,
where $i,j = 1, \ldots, m$. The interpretation of this is that $x_\lambda$ represents the particular
fact of the matter where $F(a_i)$ has truth value $\lambda_i$ and $G(b_j)$ has truth value $\lambda_{m+j}$,
so that their conjunction $(F(a_i), G(b_j))$ has truth value $\lambda_i \cdot \lambda_{m+j}$. In this state the
probability of the said configuration is one and all other states have probability zero;
arbitrary probability assignments then lie in $C_m$. The point, then, is to characterize
the convex polytope $C_m \subset \mathbb{R}^{2m+m^2}$ through a finite set of inequalities, which turn
out to be generalized Bell inequalities. Seeing this result requires some background.

Let $V$ be a real topological vector space with (continuous) dual $V^*$; if $V = \mathbb{R}^n$ we
may also put $V^* = \mathbb{R}^n$ and write $\varphi(v)$ as an inner product $\langle \varphi, v \rangle$ in what follows.

1. Any (not necessarily convex) subset $S \subset V$ has a polar $S^\circ \subset V^*$ defined by

$$S^\circ = \{\varphi \in V^* \mid \varphi(v) \leq 1 \ \forall v \in S\}, \tag{6.252}$$

which is a closed convex subset of $V^*$. If $S = K$ is a compact convex set, we have

$$K^\circ = \{\varphi \in V^* \mid \varphi(v) \leq 1 \ \forall v \in \partial_e K\}. \tag{6.253}$$

2. The bipolar theorem (cf. e.g. Simon (2011), Theorem 5.5) states that

$$S^{\circ\circ} = \overline{\mathrm{co}}(S \cup \{0\}). \tag{6.254}$$

In particular, if $K$ is a closed convex set containing the origin, then

$$K^{\circ\circ} = K, \tag{6.255}$$

and hence, if $K^\circ$ is a compact convex set, we may reconstruct $K$ from $\partial_e K^\circ$ as

$$K = \{v \in V \mid \varphi(v) \leq 1 \ \forall \varphi \in \partial_e K^\circ\}. \tag{6.256}$$

3. In particular, if $K$ is a convex polytope in a finite-dimensional vector space containing
the origin, then so is $K^\circ$. In that case, $\partial_e K^\circ$ is a finite set and so points in $K$
are characterized by a finite set of linear inequalities (6.256), which describe the
faces of the polytope. In this case, the existence of this (dual) description of $K$ is known as
the Minkowski-Weyl Theorem, see e.g. Paffenholz (2010) for applications.

For example, among the five Platonic solids (i.e. in $\mathbb{R}^3$) the cube and the octahedron
are dual to each other, as are the dodecahedron and the icosahedron, whereas the
tetrahedron is self-dual. A propos, the latter arises as the convex polytope $C_1$ for
$m = 1$ in the above story: clearly $2m + m^2 = 3$, and for the vertices of $C_1$ one takes
the four points $x_\lambda$ ensuing from the four possibilities $\lambda = (0,0), (1,0), (0,1), (1,1)$,
i.e., $x_\lambda = (0,0,0), (1,0,0), (0,1,0), (1,1,1)$. Then the inequalities in (6.256) are

$$p_{1,2} \geq 0, \quad p_1 \geq p_{1,2}, \quad p_2 \geq p_{1,2}, \quad p_1 + p_2 - p_{1,2} \leq 1. \tag{6.257}$$

For $m = 2$ the ensuing convex polytope $C_2 \subset \mathbb{R}^8$ is the convex hull of 16 extreme
points, whose inequalities may be found in Pitowsky (1989, p. 27); these imply the
CHSH inequality, whose violation in quantum mechanics therefore shows that the
probabilities in question have no local realistic model.
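
The vertices (6.251) are easy to enumerate, which gives a quick sanity check of (6.257). In the sketch below, the function names are illustrative, and the combination tested for $m = 2$ is a standard Clauser-Horne-type expression that I am assuming as an example; it is not copied from Pitowsky’s list of inequalities.

```python
from itertools import product

def vertices(m):
    """All points x_lambda of (6.251) for lambda in {0,1}^(2m).

    The pair entries are ordered as (1,m+1), ..., (1,2m), (2,m+1), ...
    """
    points = []
    for lam in product((0, 1), repeat=2 * m):
        pairs = [lam[i] * lam[m + j] for i in range(m) for j in range(m)]
        points.append(tuple(lam) + tuple(pairs))
    return points

def satisfies_6_257(x):
    """The four inequalities (6.257) for a point x = (p1, p2, p12), m = 1."""
    p1, p2, p12 = x
    return p12 >= 0 and p1 >= p12 and p2 >= p12 and p1 + p2 - p12 <= 1

def ch_expression(x):
    """A Clauser-Horne-type combination for m = 2 (an assumed example).

    x = (p1, p2, p3, p4, p13, p14, p23, p24), where 3 and 4 label the
    settings on the second wing; on C_2 this combination lies in [-1, 0].
    """
    p1, p2, p3, p4, p13, p14, p23, p24 = x
    return p13 + p14 + p24 - p23 - p1 - p4
```

Since both conditions are linear in $p$, checking them on the extreme points suffices to establish them on the whole polytope: `vertices(1)` returns the four vertices of the tetrahedron $C_1$, all of which satisfy (6.257), and all 16 points of `vertices(2)` give `ch_expression` a value of $-1$ or $0$.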

More generally, suppose we have $n$ yes-no experiments $(E_1, \ldots, E_n)$ and some
subset $S_n$ of the set $\{(i,k) \mid 1 \leq i < k \leq n\}$ (above we had $n = 2m$, $E_i = F(a_i)$ for
$i = 1, \ldots, m$, $E_{m+j} = G(b_j)$ for $j = 1, \ldots, m$, and $S_n = \{(i, m+j) \mid 1 \leq i,j \leq m\}$). This
gives surface probabilities $(p_1, \ldots, p_n, p_{i,k})$, where $(i,k) \in S_n$, which form a vector
$p$ in $\mathbb{R}^{n+|S_n|}$. As in (6.251), each truth assignment $\lambda = (\lambda_1, \ldots, \lambda_n)$, $\lambda_i \in \{0,1\}$, then
defines a point $x_\lambda \in \mathbb{R}^{n+|S_n|}$ with coordinates $(\lambda_1, \ldots, \lambda_n, \lambda_i \cdot \lambda_k)$, where once again
$(i,k) \in S_n$. This set of $2^n$ points in turn spans a convex polytope $C_{S_n}$ characterized
by inequalities following from the dual characterization (6.256). Classical thinking
would constrain the $p$ so as to lie in $C_{S_n}$, and indeed we have $p \in C_{S_n}$ iff there is a
probability space $(X, \Sigma, \mu)$ such that $p_i = \mu(A_i)$ and $p_{i,k} = \mu(A_i \cap A_k)$ for certain
events $A_i \in \Sigma$, cf. Theorem 2.3 in Pitowsky (1989), which is based on Fine (1982).

Some authors claim on this basis that Bell-type inequalities have nothing to do
with physics, but surely the point is that some physical assumptions (notably local
realism) have to be made in order to justify the “classical thinking” behind $C_{S_n}$.

§6.6. The Colbeck-Renner Theorem

This section is based on Colbeck & Renner (2011, 2012a, 2012b), where the 
main idea originates (alas with unclear assumptions and at best heuristic “proofs”), 
Braunstein & Caves (1990), who provided steps 1 and 2 of the proof, and Landsman 
(2015), whom we follow closely. See also Leegwater (2016) for a technically different
approach (by a far more complicated argument, Leegwater seems to manage
to do without our CP assumption, i.e., continuity of probabilities). 





Chapter 7 

Limits: Small ħ


Limits are essential to the asymptotic Bohrification program. It was recognized at
an early stage in the development of quantum mechanics that the limit $\hbar \to 0$ of
Planck’s constant going to zero should play a role in the derivation of classical
physics from quantum theory, and later on also the thermodynamic limit (which
often means “$\lim_{N \to \infty}$”, where $N$ is the number of particles in the system) became a
subject of interest in quantum statistical mechanics. The conceptual status of these
limits will be discussed in Chapter 10; in the present one we mainly explain the
underlying mathematics. However, one question needs to be addressed immediately,
since it is a source of much confusion. Varying $N$ seems a realistic thing to do in the
lab or on paper, whereas $\hbar$ is a constant, so how can it be varied? The answer is that
$\hbar$ is a dimensionful constant, from which one forms dimensionless combinations
of $\hbar$ and other parameters; such a combination then re-enters the theory as if it were a
dimensionless version of $\hbar$ that can indeed be varied. The oldest example is Planck’s
radiation formula $E_\nu/N_\nu = h\nu/(e^{h\nu/kT} - 1)$, with temperature $T$ as the pertinent
variable. Indeed, the observation of Einstein and Planck that in the limit $h\nu/kT \to 0$
this formula converges to the classical equipartition law $E_\nu/N_\nu = kT$ may well be
the first use of the $\hbar \to 0$ limit of quantum theory; note that Einstein put $h\nu/kT \to 0$
by letting $\nu \to 0$ at fixed $T$ and $\hbar$, whereas Planck took $T \to \infty$ at fixed $\nu$ and $\hbar$!

Another example is the Hamiltonian $h = -\frac{\hbar^2}{2m}\Delta_x + V(x)$ in the Schrödinger equation
of non-relativistic quantum mechanics, where $m$ is the mass of the pertinent
particle. Here one may pass to dimensionless parameters by introducing an energy
scale $\varepsilon$ typical of $h$, like $\varepsilon = \sup_x |V(x)|$, as well as a typical length scale $\ell$, such
as $\ell = \varepsilon/\sup_x |\nabla V(x)|$ (if these quantities are finite). In terms of the dimensionless
variable $\tilde x = x/\ell$, the rescaled Hamiltonian $\tilde h = h/\varepsilon$ is then dimensionless and equal
to $\tilde h = -\tilde\hbar^2 \tilde\Delta + \tilde V(\tilde x)$, where $\tilde\hbar = \hbar/(\ell\sqrt{2m\varepsilon})$, the operator $\tilde\Delta$ is the Laplacian for $\tilde x$, and
$\tilde V(\tilde x) = V(\ell \tilde x)/\varepsilon$. Here $\tilde\hbar$ is dimensionless, and one might study the regime where it
is small. Similarly, it is often realistic to rescale the potential $V$ by a positive number
$\lambda$, in which case $h_\lambda = -\frac{\hbar^2}{2m}\Delta + \lambda V(x)$ can be rescaled to $h_\lambda/\lambda = -\frac{\tilde\hbar^2}{2m}\Delta + V(x)$,
with $\tilde\hbar = \hbar/\sqrt{\lambda}$, so that the “large $V$ limit” $\lambda \to \infty$ comes down to $\tilde\hbar \to 0$.
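
The first example above, the convergence of Planck’s formula to the equipartition law as $h\nu/kT \to 0$, is easily illustrated numerically. The sketch below works with the dimensionless ratio $x = h\nu/kT$ and units in which $kT = 1$; both the function names and the choice of units are assumptions made for the illustration.

```python
import math

def planck_mean_energy(x, kT=1.0):
    """Mean energy h*nu / (exp(h*nu/kT) - 1), written in terms of x = h*nu/kT."""
    # expm1 keeps the denominator accurate when x is tiny.
    return kT * x / math.expm1(x)

def relative_deviation_from_equipartition(x):
    """|E/N - kT| / kT as a function of the dimensionless ratio x."""
    return abs(planck_mean_energy(x) - 1.0)
```

For small $x$ the deviation behaves like $x/2$, so it makes no difference whether $x$ is made small by lowering $\nu$ or by raising $T$, which is the point of the remark about Einstein and Planck.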

In (older) textbooks on quantum mechanics the limit $\hbar \to 0$ is typically studied
using the so-called WKB-approximation. This may be justified on historical grounds,
but in fact this approximation is rarely applicable, and is extremely delicate even
when it applies. Fortunately, a much more satisfactory and almost universally applicable
framework has become available since the 1990s, namely (strict) deformation
quantization, where the word “strict” (which we will henceforth omit) refers to the
fact that in this approach $\hbar$ is a real number that can “really” (!) be varied and hence
can be made small (as opposed to formal deformation quantization, where $\hbar$ is a formal
parameter having no actual value). Also, “strict” sometimes refers to the use of
C*-algebras and the high mathematical standards this brings. In the formalism that 
follows, (deformation) quantization and the classical limit of quantum mechanics 
are seen as two sides of the same coin, as the axioms of quantization are predicated 
on recovering the correct classical limit, while conversely the classical limit only 
makes sense in the context of some correct notion of quantization. 

The starting point of deformation quantization is a phase space $X$, mathematically
described as a Poisson manifold, i.e., a manifold equipped with a Poisson
bracket $\{\cdot,\cdot\}$ on its algebra of smooth functions $C^\infty(X)$, see §3.2. We recall that
a Poisson bracket is a Lie bracket on $C^\infty(X)$ with the additional property that for
each $h \in C^\infty(X)$, the map $\delta_h(f) = \{h, f\}$ is a derivation of $C^\infty(X)$ with respect to its
structure as a commutative algebra under pointwise multiplication, i.e.,

$$\delta_h(fg) = f\delta_h(g) + \delta_h(f)g. \tag{7.1}$$

Furthermore, like pointwise multiplication, the Poisson bracket preserves real-valuedness,
i.e., if $f \in C^\infty(X, \mathbb{R})$ and $g \in C^\infty(X, \mathbb{R})$, then also $\{f,g\} \in C^\infty(X, \mathbb{R})$.

As early as 1925, Dirac noted the formal analogy between Poisson brackets
of functions on phase space and commutators of operators on Hilbert space (i.e.,
$[a,b] = ab - ba$). Indeed, if $A$ is any C*-algebra, the commutator is a Lie bracket on
$A$, and if we use $[a,b]' = i(ab - ba)$, then also self-adjointness is preserved (in that
$a^* = a$ and $b^* = b$ implies that also $[a,b]'$ is self-adjoint, which fails to be the case
for the commutator itself unless it vanishes). Thus $[-,-]'$ is a Lie bracket on $A_{\mathrm{sa}}$.

Moreover, if for fixed $a \in A$ we define $\delta_a(b) = [a,b]'$, then we have the product rule

$$\delta_a(bc) = \delta_a(b)c + b\delta_a(c), \tag{7.2}$$

which makes $\delta_a : A \to A$ a derivation. A problem arises if one wishes to restrict $\delta_a$
to $A_{\mathrm{sa}}$, since this subspace is not stable under multiplication. This may be remedied
by passing to the Jordan product (5.14), i.e., $a \circ b = \frac{1}{2}(ab + ba)$, which is defined on
$A_{\mathrm{sa}}$. If $a^* = a$, then $\delta_a : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ satisfies the rule (7.2) also with respect to $\circ$.

All this remains true if $[-,-]'$ is rescaled by a nonzero real number. Which number
this should be was suggested by Schrödinger’s construction of momentum and
position operators on the Hilbert space $H = L^2(\mathbb{R})$ through the substitutions

$$p \mapsto \hat p = \frac{\hbar}{i}\frac{d}{dx}, \tag{7.3}$$
$$q \mapsto \hat q = x, \tag{7.4}$$
where “$x$” is the multiplication operator $m_{\mathrm{id}}$ (with $\mathrm{id}(x) = x$), i.e., $\hat q\psi(x) = x\psi(x)$;
for the moment we will not be bothered by the fact that these operators are unbounded;
let us say they are both defined on the domain $C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$.

This yields the canonical commutation relations (which formally hold on $C_c^\infty(\mathbb{R})$):

$$\frac{i}{\hbar}[\hat p, \hat q] = 1_H. \tag{7.5}$$

Noting the Poisson brackets (in which $p, q$ are the coordinate functions on $X = \mathbb{R}^2$)

$$\{p, q\} = 1_X, \tag{7.6}$$

it is clear that the analogy should be between $\{-,-\}$ and $(i/\hbar)[-,-]$. Thus Dirac wrote:

‘The strong analogy between the quantum P.B. defined by [$(i/\hbar)$ times the commutator] and
the classical P.B. (...) leads us to make the assumption that the quantum P.B.’s, or at any
rate the simpler ones of them, have the same values as the corresponding classical P.B.’s.’

Combined with Heisenberg’s decisive idea that quantum mechanics should be an
Umdeutung (i.e., reinterpretation) of classical mechanics, one is led to the idea that
“quantization” should be given by a linear map

$$f \mapsto Q_\hbar(f), \tag{7.7}$$

where $f$ is some (smooth) function on phase space $X$ and $Q_\hbar(f)$ is some operator
on some “corresponding” Hilbert space, whose identification or construction is a
separate problem (but for $X = \mathbb{R}^2$ it should apparently be $L^2(\mathbb{R})$), such that

$$\frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] = Q_\hbar(\{f,g\}), \tag{7.8}$$

at least for functions $f, g \in C^\infty(X)$ with ‘the simpler’ Poisson brackets. If only to do
justice to Schrödinger’s example (7.3) - (7.4) with (7.5), one should also require

$$Q_\hbar(1_X) = 1_H. \tag{7.9}$$

The act of quantization should also preserve the adjoint, i.e., writing $f^*(x) = \overline{f(x)}$,

$$Q_\hbar(f^*) = Q_\hbar(f)^*. \tag{7.10}$$

Putting $\hbar$ on the right-hand side of eqs. (7.5) and (7.8), Dirac (and similarly the
Dreimännerarbeit of Born, Heisenberg, and Jordan) concluded from these equations that:

‘classical mechanics may be regarded as the limiting case of quantum mechanics when $\hbar$
tends to zero.’

In the remainder of this chapter we try to do justice to this fabulous insight of Dirac’s 
(and also of Born, Heisenberg, and Jordan, or even Planck, Einstein, and Bohr, none 
of whom seem to have quite appreciated the stupendous complexity of the claim). 
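
The canonical commutation relation (7.5) is an algebraic identity that can be checked on any space of smooth functions stable under $\hat p = (\hbar/i)\,d/dx$ and $\hat q = x$. As a minimal sketch, the code below uses polynomials with complex coefficients (which do not lie in $L^2(\mathbb{R})$, so this tests only the formal identity, not any domain question); the value of `HBAR` and all function names are illustrative assumptions.

```python
HBAR = 0.5  # arbitrary positive value chosen for this illustration

def p_hat(coeffs):
    """(hbar/i) d/dx applied to the polynomial sum_k coeffs[k] x^k."""
    derivative = [k * c for k, c in enumerate(coeffs)][1:] or [0j]
    return [(HBAR / 1j) * c for c in derivative]

def q_hat(coeffs):
    """Multiplication by x: shift all coefficients up by one degree."""
    return [0j] + list(coeffs)

def subtract(a, b):
    """Coefficientwise a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def scaled_commutator(coeffs):
    """(i/hbar) [p_hat, q_hat] applied to the polynomial."""
    comm = subtract(p_hat(q_hat(coeffs)), q_hat(p_hat(coeffs)))
    return [(1j / HBAR) * c for c in comm]
```

Applied to any coefficient list, `scaled_commutator` returns the same coefficients (possibly padded with a trailing zero), i.e. $(i/\hbar)[\hat p, \hat q]$ acts as the identity, whatever positive value is chosen for `HBAR`.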

7.1 Deformation quantization


Recall Definition C.121 of a continuous bundle of C*-algebras over some space $I$,
which below is taken to be a subset of the unit interval $[0,1]$ that contains 0 as an
accumulation point (so one may have e.g. $I = [0,1]$ itself, or $I = (1/\mathbb{N}) \cup \{0\}$).

Definition 7.1. A deformation quantization of a Poisson manifold $X$ consists of a
continuous bundle of C*-algebras $(A, \varphi_\hbar : A \to A_\hbar)_{\hbar \in I}$ over $I$, along with maps

$$Q_\hbar : \tilde A_0 \to A_\hbar \quad (\hbar \in I), \tag{7.11}$$

where $\tilde A_0$ is a dense subspace of $A_0 = C_0(X)$, such that:

1. $Q_0$ is the inclusion map $\tilde A_0 \hookrightarrow A_0$;

2. Each map $Q_\hbar$ is linear and satisfies (7.10);

3. For each $f \in \tilde A_0$ the following map is a continuous section of the bundle:

$$0 \mapsto f; \tag{7.12}$$
$$\hbar \mapsto Q_\hbar(f) \quad (\hbar > 0); \tag{7.13}$$

4. For all $f, g \in \tilde A_0$ one has the Dirac-Groenewold-Rieffel condition

$$\lim_{\hbar \to 0} \left\| \frac{i}{\hbar}[Q_\hbar(f), Q_\hbar(g)] - Q_\hbar(\{f,g\}) \right\| = 0. \tag{7.14}$$

It follows from the definition of a continuous bundle that continuity properties like

$$\lim_{\hbar \to 0} \|Q_\hbar(f)\| = \|f\|_\infty; \tag{7.15}$$
$$\lim_{\hbar \to 0} \|Q_\hbar(f)Q_\hbar(g) - Q_\hbar(fg)\| = 0, \tag{7.16}$$

are automatically satisfied. Let us note that condition (7.9) is absent from this definition,
because $1_X \notin C_0(X)$ whenever $X$ is not compact, in which case typically also
the C*-algebras $A_\hbar$ have no unit (see below). However, the given conditions turn out
to be sufficiently powerful to produce the “right” examples. We give one of the main
such examples without proof (the underlying analysis is quite forbidding). We put

$$A_0 = C_0(T^*\mathbb{R}^n); \tag{7.17}$$
$$A_\hbar = B_0(L^2(\mathbb{R}^n)) \quad (\hbar > 0), \tag{7.18}$$

where $T^*\mathbb{R}^n = \mathbb{R}^{2n}$ carries the canonical Poisson structure (3.34), and $A_\hbar$ is the C*-algebra
of compact operators on the familiar Hilbert space $L^2(\mathbb{R}^n)$ of wave-functions
on $\mathbb{R}^n$. For the sake of completeness we also mention that

$$A = C^*_r((\mathbb{R}^n \times \mathbb{R}^n)^T) \tag{7.19}$$

is the (reduced) C*-algebra of the tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$ to the pair groupoid
$\mathbb{R}^n \times \mathbb{R}^n$ on $\mathbb{R}^n$, see §§C.16, C.19, where one may also find the maps $\varphi_\hbar$.

Let us summarize the situation. Continuity of the limit $\hbar \to 0$ is hard to envisage
if one merely has the classical phase space $X = T^*\mathbb{R}^n$ and the quantum Hilbert
space $L^2(\mathbb{R}^n)$ in mind. However, the move to either: the underlying Lie groupoids
$T\mathbb{R}^n$ and $\mathbb{R}^n \times \mathbb{R}^n$, which jointly comprise the smooth tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$,
or: the corresponding canonically defined C*-algebras $C_0(T^*\mathbb{R}^n)$ and $B_0(L^2(\mathbb{R}^n))$,
which are glued together as a continuous bundle (7.17) - (7.19), does give rise to a
satisfactory structure that makes the limit $\hbar \to 0$ “continuous”.

In this example, various possibilities for the quantization maps $Q_\hbar$ arise. As explained
in §C.19, the groupoid structure underlying (7.17) - (7.18) suggests Weyl’s
prescription (C.549), which for convenience we reproduce:

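$$Q_\hbar^W(f)\psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^ny}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar} f\left(p, \tfrac12(x+y)\right)\psi(y), \tag{7.20}$$

where $f$ lies in the image of $C_c^\infty(T\mathbb{R}^n)$ under the fiberwise Fourier transform (C.547).
This image, then, is the space $\tilde A_0$ in Definition 7.1. We may rewrite (7.20) as

$$Q_\hbar^W(f) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi)^n}\, f(p,q)\,\Omega_\hbar^W(q,p), \tag{7.21}$$

where the operators in the integrand are given by

$$\Omega_\hbar^W(q,p)\psi(x) = (2/\hbar)^n e^{2ip(x-q)/\hbar}\psi(2q - x). \tag{7.22}$$

The purpose of (7.21) is that for each $\psi \in L^2(\mathbb{R}^n)$ we then obviously have

$$\langle\psi, Q_\hbar^W(f)\psi\rangle = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi)^n}\, f(p,q)\, W_\psi^\hbar(p,q), \tag{7.23}$$

where $W_\psi^\hbar : T^*\mathbb{R}^n \to \mathbb{R}$ is the Wigner function, given by

$$W_\psi^\hbar(p,q) = \langle\psi, \Omega_\hbar^W(q,p)\psi\rangle \tag{7.24}$$
$$= \int_{\mathbb{R}^n} d^nv\, e^{ipv}\, \overline{\psi(q + \tfrac12\hbar v)}\,\psi(q - \tfrac12\hbar v). \tag{7.25}$$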

If $\|\psi\| = 1$, then $W_\psi^\hbar$ gives a “phase space portrait” of the corresponding pure state
$e_\psi$ on $B_0(L^2(\mathbb{R}^n))$. However, this portrait cannot be interpreted as a probability density
on $T^*\mathbb{R}^n$, since the Wigner function is not necessarily positive. This reflects a
problem with Weyl’s quantization map itself (at fixed $\hbar > 0$). We say that $Q_\hbar$ as
introduced in (7.11) is positive if, for each $f \in \tilde A_0 \subset A_0$ (seen as a C*-algebra),

$$f \geq 0 \Rightarrow Q_\hbar(f) \geq 0, \tag{7.26}$$

where positivity of $Q_\hbar(f)$ is defined in the C*-algebra $A_\hbar$ (which in the case at hand
is $B_0(L^2(\mathbb{R}^n))$). This is not the case for $Q_\hbar^W$. Moreover, $Q_\hbar^W$ fails to be continuous,
and for this reason it cannot be extended to $A_0$ (at least not in the obvious way, viz.
by continuity). Fortunately, both problems can be resolved by a change in $Q_\hbar$.
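
That the Wigner function may take negative values is easy to see for $n = 1$ and an odd wave-function $\psi$: reading (7.25) as $W(p,q) = \int dv\, e^{ipv}\, \overline{\psi(q + \tfrac12\hbar v)}\,\psi(q - \tfrac12\hbar v)$ (the normalization is an assumption of this sketch), one finds $W(0,0) = -\int dv\, |\psi(\tfrac12\hbar v)|^2 < 0$. The code below confirms this numerically for the unnormalized function $\psi(x) = x\,e^{-x^2/2\hbar}$ with $\hbar = 1$; the test function, the integration window, and the step count are illustrative choices.

```python
import math

HBAR = 1.0  # illustrative value

def psi_odd(x):
    """An odd (unnormalized) wave-function on R, chosen for illustration."""
    return x * math.exp(-x * x / (2.0 * HBAR))

def wigner(psi, p, q, v_max=40.0, steps=40001):
    """Riemann-sum approximation of (7.25) for n = 1 and real-valued psi.

    For real psi the product psi(q + hv/2) * psi(q - hv/2) is even in v,
    so the sin(p v) part of exp(i p v) integrates to zero and only
    cos(p v) is needed.
    """
    dv = 2.0 * v_max / (steps - 1)
    total = 0.0
    for k in range(steps):
        v = -v_max + k * dv
        total += (math.cos(p * v)
                  * psi(q + 0.5 * HBAR * v) * psi(q - 0.5 * HBAR * v))
    return total * dv
```

For this $\psi$ the exact value is $W(0,0) = -\sqrt{\pi}$, which the Riemann sum reproduces to high accuracy, so no rescaling of $\psi$ can turn its Wigner function into a probability density.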
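
A strict deformation quantization of $T^*\mathbb{R}^n$ that is positive exists under the name
of Berezin quantization, denoted by $Q_\hbar^B$. However, the fundamental idea of the underlying
coherent states goes back to Schrödinger. For each $(p,q) \in T^*\mathbb{R}^n$ and $\hbar > 0$,
define a unit vector $\varphi_\hbar^{(p,q)} \in L^2(\mathbb{R}^n)$, called a coherent state, by

$$\varphi_\hbar^{(p,q)}(x) = (\pi\hbar)^{-n/4} e^{-ipq/2\hbar} e^{ipx/\hbar} e^{-(x-q)^2/2\hbar}. \tag{7.27}$$

Writing $z = p + iq$, the transition probability between two coherent states is

$$|\langle\varphi_\hbar^{z}, \varphi_\hbar^{z'}\rangle|^2 = e^{-|z - z'|^2/2\hbar}. \tag{7.28}$$

In terms of these coherent states, we define $Q_\hbar^B : C_0(T^*\mathbb{R}^n) \to B_0(L^2(\mathbb{R}^n))$ by

$$Q_\hbar^B(f) = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, f(p,q)\, |\varphi_\hbar^{(p,q)}\rangle\langle\varphi_\hbar^{(p,q)}|, \tag{7.29}$$

where the integral is meant in the sense that for each $\psi, \varphi \in L^2(\mathbb{R}^n)$ we have

$$\langle\varphi, Q_\hbar^B(f)\psi\rangle = \int_{T^*\mathbb{R}^n} \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, f(p,q)\,\langle\varphi, \varphi_\hbar^{(p,q)}\rangle\langle\varphi_\hbar^{(p,q)}, \psi\rangle. \tag{7.30}$$

In particular, for each unit vector $\psi \in L^2(\mathbb{R}^n)$ we may write

$$\langle\psi, Q_\hbar^B(f)\psi\rangle = \int_{T^*\mathbb{R}^n} d\mu_\psi\, f, \tag{7.31}$$

where $\mu_\psi$ is the probability measure on $T^*\mathbb{R}^n$ with density

$$B_\psi^\hbar(p,q) = |\langle\varphi_\hbar^{(p,q)}, \psi\rangle|^2, \tag{7.32}$$

called the Husimi function of $\psi \in L^2(\mathbb{R}^n)$; in other words, $\mu_\psi$ is given by

$$d\mu_\psi(p,q) = \frac{d^np\,d^nq}{(2\pi\hbar)^n}\, B_\psi^\hbar(p,q). \tag{7.33}$$

Weyl and Berezin quantization are related in many ways, for example, by

$$Q_\hbar^B(f) = Q_\hbar^W\left(e^{\frac{\hbar}{4}\Delta_{2n}} f\right), \tag{7.34}$$

where $\Delta_{2n} = \sum_{j=1}^n (\partial^2/\partial p_j^2 + \partial^2/\partial q_j^2)$, from which it follows that Weyl and
Berezin quantization are asymptotically equal in the sense that for any $f \in \tilde A_0$,

$$\lim_{\hbar \to 0} \|Q_\hbar^B(f) - Q_\hbar^W(f)\| = 0. \tag{7.35}$$

Indeed, this provides one way (among various others) of proving that $Q_\hbar^B$ satisfies
Definition 7.1, where we note that even though $Q_\hbar^B$ is defined on all of $C_0(T^*\mathbb{R}^n)$,
eq. (7.14) only holds on a suitable dense subspace thereof, such as $C_c^\infty(T^*\mathbb{R}^n)$.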



7.2 Quantization and internal symmetry 


In the presence of symmetries, Dirac’s condition (7.8) can often be met by suitable 
functions $f$ and $g$ related to the symmetries in question, though such functions may
be unbounded. This blasts the C*-algebraic framework, but it does so in a controlled 
way. We start with internal symmetries, like spin, which will be coupled to motion 
in the next step. Let G be a Lie group with Lie algebra g, to which we associate: 

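• The “classical” Lie-Poisson manifold $\mathfrak{g}^*$, see (3.98), whose Poisson bracket we
now preface with a minus sign, so that instead of (3.98) and (3.99) we now have

$$\{f, g\}_-(\theta) = -C_{ab}^c\,\theta_c\,\frac{\partial f(\theta)}{\partial\theta_a}\frac{\partial g(\theta)}{\partial\theta_b}; \tag{7.36}$$
$$\{\hat A, \hat B\}_- = -\widehat{[A,B]}. \tag{7.37}$$

We write $\mathfrak{g}^*_-$ for this Poisson manifold.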

• The “quantum-mechanical” reduced group(oid) C*-algebra $C^*_r(G)$, cf. §C.18,
defined as the norm-closure of $\pi(C_c^\infty(G))$ within $B(L^2(G))$, where

$$\pi(f)\psi = f * \psi; \tag{7.38}$$
$$f * \psi(x) = \int_G dy\, f(xy)\psi(y^{-1}), \tag{7.39}$$

where $f \in C_c^\infty(G)$ and $\psi \in L^2(G)$, cf. (C.481), and $dy$ is Haar measure on $G$
(which also provides the measure defining the Hilbert space $L^2(G)$).

We then obtain a continuous bundle of C*-algebras, with fibers and total C*-algebra

$$A_0 = C^*_r(\mathfrak{g}); \tag{7.40}$$
$$A_\hbar = C^*_r(G) \quad (\hbar > 0); \tag{7.41}$$
$$A = C^*_r(G^T), \tag{7.42}$$

where $\mathfrak{g}$ is seen as an abelian Lie group under addition, cf. Theorem C.123. We have

$$C^*_r(\mathfrak{g}) \cong C_0(\mathfrak{g}^*_-), \tag{7.43}$$

which isomorphism (i.e. of C*-algebras) is given by the Fourier transform

$$\hat f(\theta) = \int_{\mathfrak{g}} d^nA\, e^{-i\theta(A)} f(A); \tag{7.44}$$

where initially $f \in C_c^\infty(\mathfrak{g})$, and the map $f \mapsto \hat f$ is subsequently extended to $C^*_r(\mathfrak{g})$
by continuity. Here the normalization of Lebesgue measure $d^nA$ on $\mathfrak{g}$ is arbitrary, but
the normalization of $d^n\theta$ is thereby fixed. In what follows, we take a (left-invariant)
Haar measure $dx$ on $G$ and fix the normalization of $d^nA$ by the condition


$$J(0) = 1 \tag{7.46}$$

in the definition of the Jacobian $J$ under the exponential map $\exp : \mathfrak{g} \to G$, i.e.,

$$\int_G dx\, f(x) = \int_{\mathfrak{g}} d^nA\, J(A)\, f(\exp(A)). \tag{7.47}$$


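With $\tilde A_0 = C_c^\infty(\mathfrak{g})$, the quantization map $Q_\hbar : C_c^\infty(\mathfrak{g}) \to C^*_r(G)$ is then given by

$$Q_\hbar(f)(e^A) = \hbar^{-n} f(A/\hbar), \tag{7.48}$$

where $n = \dim(G)$ and we assume that $\hbar > 0$ is small enough that $\hbar$ times the support
of $f \in C_c^\infty(\mathfrak{g})$ is contained in an open neighbourhood $U$ of $0 \in \mathfrak{g}$ where the
exponential map is a diffeomorphism onto some open neighbourhood $U'$ of $e \in G$;
otherwise a cutoff function should be included. Equivalently, defining $\tilde A_0 \subset C_0(\mathfrak{g}^*_-)$
as the image of $C_c^\infty(\mathfrak{g})$ under the Fourier transform $f \mapsto \hat f$ (which consists of the
so-called Paley-Wiener functions on $\mathfrak{g}^*$), the map $Q_\hbar : \tilde A_0 \to C^*_r(G)$ is given by

$$Q_\hbar(f)(e^A) = \hbar^{-n} \check f(A/\hbar), \tag{7.49}$$

where $\check f \in C_c^\infty(\mathfrak{g})$ denotes the inverse Fourier transform of $f \in \tilde A_0$, i.e., the function whose Fourier transform (7.44) is $f$.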

Although these maps satisfy (7.14), if $G$ is non-abelian there are no natural functions
on $\mathfrak{g}^*$ whose quantizations satisfy the exact Dirac condition (7.8). This is a limitation
of the C*-algebraic framework, since candidate functions like

$$\hat A : \mathfrak{g}^* \to \mathbb{R}; \tag{7.50}$$
$$\hat A(\theta) = \theta(A), \tag{7.51}$$

whose Poisson brackets (3.99) are promising, are unbounded. However, this is easily
remedied by regarding $C^*_r(G)$ as an algebra of bounded operators on the Hilbert
space $L^2(G)$ (which indeed is the way it was originally defined) rather than abstractly.
This “spatial” context allows the passage to the Lie algebra, as reviewed in
§5.6, see especially (5.156) - (5.161). First note that (7.38) - (7.39) is a special case
of (5.172), where $H = L^2(G)$ and $u = u_L$, i.e., the left-regular representation

$$u_L(y)\psi(x) = \psi(y^{-1}x). \tag{7.52}$$

In this representation, the construction (5.156) then realizes $\mathfrak{g}$ as right-invariant differential
operators on the Gårding domain $D_G \subset C^\infty(G)$. By definition of $C^*_r(G)$,
seen as an operator on $L^2(G)$ the function $Q_\hbar(f)$ is given in coordinates by

$$Q_\hbar(f) = \int d^nX\, \check f(X)\, J(\hbar X)\, u_L(\exp(\hbar X)), \tag{7.53}$$

where $\check f$ denotes the inverse Fourier transform of $f$, i.e., the function on $\mathfrak{g}$ whose Fourier transform (7.44) is $f$.
Here $(X_1, \ldots, X_n)$ in (7.53) are coordinates on $\mathfrak{g}$ defined by a basis choice $(T_1, \ldots, T_n)$,
i.e., $A = \sum_j X_j T_j$. The function $\hat T_j$ on $\mathfrak{g}^*$ is then simply given by the coordinate function
$\hat T_j(\theta) = \theta_j$. Now take $A \in \mathfrak{g}$ and assume that $f = \hat A$. This function is unbounded,
but the following formal calculation is rigorously correct on the Gårding domain and
may be justified by some distribution theory. For simplicity we assume that $G$ is unimodular,
in which case $J(X) = 1 + O(X^2)$ as $X \to 0$, so that all first derivatives of $J$
vanish at $X = 0$. Taking $f = \hat T_j$ in (7.53) then gives

$$Q_\hbar(\hat T_j) = -i\int d^nX\, J(\hbar X)\, u_L(\exp(\hbar X))\, \frac{\partial}{\partial X_j}\delta(X) = i\hbar u_L'(T_j), \tag{7.54}$$

from which we obtain

$$Q_\hbar(\hat A) = i\hbar u_L'(A) = \pi_L(A). \tag{7.55}$$

This explains the need for minus the Lie-Poisson bracket, since instead of (3.99) we
now have (7.37), so that (5.160) gives the exact result (7.8) for $f = \hat A$ and $g = \hat B$:

$$\frac{i}{\hbar}[Q_\hbar(\hat A), Q_\hbar(\hat B)] = Q_\hbar(\{\hat A, \hat B\}_-). \tag{7.56}$$

The minus sign in the Lie-Poisson bracket could have been avoided by writing
$f(-A/\hbar)$ in (7.48), whose minus sign would have propagated into (5.159) and hence
into the commutation relations (5.160), but the latter are so engrained in the physics
literature that we see the minus sign on the bracket in (7.56) as the lesser evil.

Any continuous unitary representation $u_\lambda$ of $G$ (where $\lambda$ is some label) induces a
representation of $C_c^\infty(G)$ by (5.173), which may be extended to a representation
of $C^*(G)$ by continuity (the same is true for $C^*_r(G)$ provided $u_\lambda$ is weakly contained
in $L^2(G)$, cf. §C.18). This gives operators $u_\lambda(Q_\hbar(f))$ which, by the same formal
computation as for the case $u = u_L$ above, for $A \in \mathfrak{g}$ rigorously give rise to operators

$$\pi_\lambda(A) = i\hbar u_\lambda'(A), \tag{7.57}$$

satisfying the like of (5.160) for fixed values of $\hbar$ (but without control over the limit
$\hbar \to 0$). Many commutation relations in quantum mechanics take this form, where
both irreducible and reducible representations $u$ give rise to interesting examples.
The reducible case typically comes from group actions and is best studied using the
formalism of action groupoids reviewed in the next section, where we will see that
further operators start playing a role. The irreducible case, on the other hand, gives
rise to intriguing new examples of continuous bundles of C*-algebras, where $\hbar$ (now
related to the label $\lambda$) takes values in a discrete set and may be sent to zero, cf. §8.1.
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7.3 Quantization and external symmetry 

We now generalize the setting of the preceding section from groups taken by themselves
to group actions. Let a Lie group $G$ act smoothly on some manifold $Q$; for
example, we may have $Q = \mathbb{R}^3$ with either $G = SO(3)$ acting by rotations, or $G = \mathbb{R}^3$
acting by translations. We now take $X = \mathfrak{g}^* \times Q$. Recalling the notation (3.71) and
writing $\xi_a = \xi_{T_a}$, we define the action Poisson bracket

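$$\{f,g\} = -C_{ab}^c\,\theta_c \frac{\partial f}{\partial\theta_a}\frac{\partial g}{\partial\theta_b} - \frac{\partial f}{\partial\theta_a}\,\xi_a g + \xi_a f\,\frac{\partial g}{\partial\theta_a}. \tag{7.58}$$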

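Interesting special cases arise if we take $A \in \mathfrak{g}$ and define $\hat{A} \in C^\infty(\mathfrak{g}^*)$ as before, i.e.,
$\hat{A}(\theta) = \theta(A)$, now regarded as a function on $\mathfrak{g}^* \times Q$ (ignoring the second argument
$q$). Similarly, if $f \in C^\infty(Q)$ we write $\tilde{f}$ for the corresponding function on $\mathfrak{g}^* \times Q$
(ignoring the first argument $\theta$). This gives the coordinate-independent expressions

$$\{\hat{A},\hat{B}\} = -\widehat{[A,B]}; \tag{7.59}$$

$$\{\hat{A},\tilde{f}\} = -\widetilde{\xi_A f}; \tag{7.60}$$

$$\{\tilde{f},\tilde{g}\} = 0. \tag{7.61}$$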


Clearly, if $Q$ is a point (with trivial $G$-action) we recover (minus) the Lie-Poisson
structure on $\mathfrak{g}^*$. If, on the other hand, $Q = \mathbb{R}^n$ and $G = \mathbb{R}^n$ acts on $Q$ by translation,
i.e., $a \cdot x = x + a$, we recover the canonical Poisson bracket (3.34), where the momenta
$p_a$ ($a = 1,\ldots,n$) are identified with the coordinates $\theta_a$ on the dual of the Lie
algebra of $\mathbb{R}^n$, which is just $\mathbb{R}^n$ itself (with the usual basis $(e_1,e_2,e_3)$). Therefore,
the Poisson bracket (3.34) on $\mathbb{R}^{2n}$ may be generalized in two ways:

1. By passing to arbitrary cotangent bundles $T^*M$, whose canonical Poisson bracket
is still given in local coordinates by (3.34), which emphasizes the role of momenta
as fiber coordinates on $T^*M$.

2. By passing to the setting discussed here, which emphasizes the role of momenta
as generators of global translations of the base space (a property that breaks
the $p$-$q$ symmetry and cannot be generalized to arbitrary cotangent bundles).

A richer structure emerges if we keep $Q = \mathbb{R}^3$ but now take $G = E(3)$, i.e.,

$$E(3) = SO(3) \ltimes \mathbb{R}^3, \tag{7.62}$$

known as the Euclidean group. To explain its group structure, let some group $L$ act
on a vector space $V$, seen as an abelian group under addition. Then the operations

$$(\lambda,v)\cdot(\lambda',v') = (\lambda\lambda', v + \lambda\cdot v'); \tag{7.63}$$

$$(\lambda,v)^{-1} = (\lambda^{-1}, -\lambda^{-1}\cdot v) \tag{7.64}$$

turn $G = L \ltimes V$ into a group, called the semi-direct product of $L$ and $V$.
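
To make the semi-direct product concrete, here is a minimal sketch for a finite example: $L = \{+1,-1\}$ acting on $V = \mathbb{Z}_N$ by $\lambda\cdot v = \lambda v \bmod N$, which yields the dihedral group. The multiplication is (7.63), and the inverse is taken to be $(\lambda,v)^{-1} = (\lambda^{-1}, -\lambda^{-1}\cdot v)$. The choice of $L$, $V$, $N$ and all function names are assumptions made for illustration only.

```python
from itertools import product

N = 5  # V = Z_N; illustrative choice

def act(lam, v):
    """Action of L = {+1, -1} on V = Z_N: lam . v = lam * v (mod N)."""
    return (lam * v) % N

def mul(g, h):
    """Semi-direct product law (lam, v)(lam', v') = (lam lam', v + lam . v')."""
    (lam, v), (lam2, v2) = g, h
    return (lam * lam2, (v + act(lam, v2)) % N)

def inv(g):
    """Inverse (lam, v)^{-1} = (lam^{-1}, -lam^{-1} . v)."""
    lam, v = g
    lam_inv = lam  # every element of {+1, -1} is its own inverse
    return (lam_inv, (-act(lam_inv, v)) % N)

IDENTITY = (1, 0)
ELEMENTS = [(lam, v) for lam in (1, -1) for v in range(N)]

def is_group():
    """Brute-force check of associativity, unit and inverses."""
    assoc = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                for a, b, c in product(ELEMENTS, repeat=3))
    unit = all(mul(IDENTITY, g) == g == mul(g, IDENTITY) for g in ELEMENTS)
    inverses = all(mul(g, inv(g)) == IDENTITY == mul(inv(g), g)
                   for g in ELEMENTS)
    return assoc and unit and inverses
```

The same two functions `mul` and `inv` describe $E(3)$ once `act` is replaced by the action of a rotation matrix on a vector; the finite case merely allows an exhaustive check.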
Then $E(3)$ acts on $\mathbb{R}^3$ in the obvious way, giving rise to the Poisson manifold
$\mathfrak{g}^* \times Q = \mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3$ (since $\mathfrak{so}(3)^* \cong \mathbb{R}^3$). We now also have generators $(J_1,J_2,J_3)$
of the Lie algebra of $SO(3)$, with corresponding functions $\hat{J}_i$, as well as standard
coordinate functions $(q_1,q_2,q_3)$ on $Q = \mathbb{R}^3$, giving rise to the Poisson brackets

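$$\{\hat{J}_i,\hat{J}_j\} = -\epsilon_{ijk}\hat{J}_k; \quad \{p_i,p_j\} = 0; \quad \{\hat{J}_i,p_j\} = -\epsilon_{ijk}p_k; \tag{7.65}$$

$$\{\hat{J}_i,q_j\} = -\epsilon_{ijk}q_k; \quad \{p_i,q_j\} = \delta_{ij}; \quad \{q_i,q_j\} = 0. \tag{7.66}$$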

The appropriate target C*-algebra $C^*(G,Q)$ for quantization is a generalization
of $C^*(G)$, constructed in a similar way, as explained in §C.18. For the moment it is
enough to know that $C^*(G,Q)$ is the completion of the function space $C_c^\infty(G \times Q)$,
seen as a *-algebra in the operations (C.526) - (C.527), in a suitable norm, namely

$$\|f\|_r = \|\rho(f)\|, \tag{7.67}$$

where the representation $\rho : C_c^\infty(G \times Q) \to B(L^2(G \times Q))$ is given by (C.530). In
case that $Q$ has a $G$-invariant measure $\nu$ (still with support $Q$), the operator

$$w : L^2(G \times Q) \to L^2(G \times Q); \tag{7.68}$$

w\j/ix,q) = (7.69) 

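is unitary, and in terms of the notation

$$\tilde{u}(y) = w u(y) w^*, \quad \tilde{\pi}(f) = w \pi(f) w^*, \quad \tilde{\rho}(f) = w \rho(f) w^*, \tag{7.70}$$

the formulae (C.528) - (C.530) take the slightly more appealing form

$$\tilde{u}(y)\psi(x,q) = \psi(y^{-1}x, y^{-1}q); \tag{7.71}$$

$$\tilde{\pi}(f)\psi(x,q) = f(q)\psi(x,q); \tag{7.72}$$

$$\tilde{\rho}(f)\psi(x,q) = \int_G dy\, f(y,q)\,\psi(y^{-1}x, y^{-1}q). \tag{7.73}$$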

The simplification thus gained especially concerns the position functions (7.72).
Analogously to (7.49), the quantization maps are given by

$$Q_\hbar : C_0(\mathfrak{g}^* \times Q) \to C_r^*(G,Q); \tag{7.74}$$

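$$Q_\hbar(f)(e^A, q) = \int_{\mathfrak{g}^*} \frac{d^n\theta}{(2\pi\hbar)^n}\, e^{i\theta(A)/\hbar} f(\theta,q), \tag{7.75}$$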

where, as in the pure group case, strictly speaking $f$ must lie in the dense subspace
of $C_0(\mathfrak{g}^* \times Q)$ consisting of Paley-Wiener functions (in A) that are the Fourier transform
(in the first argument) of functions that lie in $C_c^\infty(\mathfrak{g} \times Q)$.

Computations similar to (7.54) then establish, for $A \in \mathfrak{g}$ and $f \in C^\infty(Q)$ as before,

$$Q_\hbar(\hat{A}) = i\hbar\, u'(A); \tag{7.76}$$

$$Q_\hbar(\tilde{f}) = \pi(f). \tag{7.77}$$
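From these formulae and (7.59) - (7.60), it is easy to verify that Dirac's exact
condition (7.8) holds in the following special cases:

$$\frac{i}{\hbar}[Q_\hbar(\hat{A}),Q_\hbar(\hat{B})] = Q_\hbar(\{\hat{A},\hat{B}\}); \tag{7.78}$$

$$\frac{i}{\hbar}[Q_\hbar(\hat{A}),Q_\hbar(\tilde{f})] = Q_\hbar(\{\hat{A},\tilde{f}\}); \tag{7.79}$$

$$\frac{i}{\hbar}[Q_\hbar(\tilde{f}),Q_\hbar(\tilde{g})] = Q_\hbar(\{\tilde{f},\tilde{g}\}) = 0. \tag{7.80}$$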


These might be regarded as infinitesimal versions of the covariance condition 
(C.514), specialized to the case at hand. We formalize this special case as follows. 

Definition 7.2. Let $G$ be a locally compact group and let $Q$ be a space equipped
with some continuous $G$-action. A system of imprimitivity $(u(G),\pi(C_0(Q)))$ for
the given group action $G \circlearrowright Q$ is a combination of a strongly continuous unitary
representation $u$ of $G$ and a nondegenerate representation $\pi$ of $C_0(Q)$, both defined
on the same Hilbert space, that for each $x \in G$ and $f \in C_0(Q)$ satisfies

$$u(x)\pi(f)u(x)^* = \pi(L_x f). \tag{7.81}$$

Here $L_x f(q) = f(x^{-1}q)$, as usual. We recall from §C.18 that such systems of
imprimitivity bijectively correspond to non-degenerate representations $\rho = \pi \rtimes u$ of
$C^*(G,Q)$ through (C.515), which in the special case (C.524) - (C.525) comes down
to

$$\rho(f) = \int_G dx\, \pi(f(x,\cdot))\,u(x). \tag{7.82}$$

The formulae (7.71) - (7.73) define such a system of imprimitivity on the Hilbert
space $\mathcal{H} = L^2(G \times Q)$. However, this cannot be the end result of quantization, since
this space is typically reducible under the pair $(u(G),\pi(C_0(Q)))$, or, equivalently,
under $\rho(C^*(G,Q))$. For example, this is the case for $G = \mathbb{R}^3$ or $G = E(3)$ acting on
$Q = \mathbb{R}^3$ in the natural way discussed above, for which we obtain $\mathcal{H} = L^2(\mathbb{R}^3 \times \mathbb{R}^3)$
or even $\mathcal{H} = L^2(E(3) \times \mathbb{R}^3)$. In the former case we do obtain the correct position
operators $q^j$, but for the momentum operators we find the curious expression
$-i\hbar(\partial/\partial x^j + \partial/\partial q^j)$; to their credit, these do satisfy the canonical commutation
relations (7.5), since these follow from (7.78) - (7.80), which in turn follow from
the covariance condition (7.81) defining a system of imprimitivity.

Instead, we would prefer the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^3)$ expected from elementary
quantum mechanics (without spin), equipped with the system of imprimitivity

$$u(y)\psi(q) = \psi(y^{-1}q); \tag{7.83}$$

$$\pi(f)\psi(q) = f(q)\psi(q). \tag{7.84}$$

The answer lies in the search for irreducible systems of imprimitivity $(u(G),\pi(C_0(Q)))$,
or, equivalently, irreducible representations $\rho(C^*(G,Q))$; see §7.5.
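
The claim about the "curious" momentum operator can be checked mechanically in one dimension. The sketch below represents polynomials in $(x,q)$ as dictionaries, implements $P = -i\hbar(\partial/\partial x + \partial/\partial q)$ and the position operator $Q = q$, and verifies $[Q,P]\psi = i\hbar\psi$. Restricting to one dimension and to polynomial test functions, as well as all names in the code, are assumptions of this illustration.

```python
# Polynomials in (x, q) are stored as {(power_of_x, power_of_q): coefficient}.
HBAR = 1.0  # illustrative value

def combine(p, r, factor=1):
    """Return p + factor * r, dropping coefficients that cancel."""
    out = dict(p)
    for key, coeff in r.items():
        out[key] = out.get(key, 0) + factor * coeff
    return {key: c for key, c in out.items() if abs(c) > 1e-12}

def derivative(var, p):
    """Partial derivative with respect to x (var = 0) or q (var = 1)."""
    out = {}
    for key, coeff in p.items():
        if key[var] > 0:
            new_key = list(key)
            new_key[var] -= 1
            out[tuple(new_key)] = key[var] * coeff
    return out

def momentum(p):
    """P = -i hbar (d/dx + d/dq), the operator found on L^2(G x Q)."""
    total = combine(derivative(0, p), derivative(1, p))
    return {key: -1j * HBAR * c for key, c in total.items()}

def position(p):
    """Q = multiplication by q."""
    return {(a, b + 1): c for (a, b), c in p.items()}

def commutator_q_p(p):
    """[Q, P] applied to the polynomial p."""
    return combine(position(momentum(p)), momentum(position(p)), factor=-1)

def is_i_hbar_times(p, result):
    """True if result equals i * hbar * p."""
    expected = {key: 1j * HBAR * c for key, c in p.items()}
    return combine(result, expected, factor=-1) == {}
```

The extra derivative $\partial/\partial x$ commutes with multiplication by $q$, which is why it does not disturb the commutation relation even though it makes the representation reducible.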
7.4 Intermezzo: The Big Picture

First, however, we summarize and generalize the results in this chapter so far 
into what we call The Big Picture. This arose in the 1990s from efforts to relate 
Mackey’s quantization theory based on systems of imprimitivity (which Mackey 
himself saw as the natural implementation of what he called Weyl’s Program, i.e. 
the construction of the basic operators of quantum mechanics from group-theoretical 
considerations) to deformation quantization (and hence to the tradition started by 
Dirac, as continued by Groenewold, Moyal, Berezin, Flato, Rieffel, and others). 

The Big Picture is technically based on the theory of Lie groupoids (already
alluded to in the preceding sections) and Lie algebroids. For a precise definition of
the former we refer to Definition C.115; briefly, a groupoid $\mathsf{G}$ is an object like a
group, where however multiplication is defined only partially (although the inverse
is defined for each element). To see which elements can be multiplied, one has maps
$s,t : \mathsf{G}_1 \to \mathsf{G}_0$ from the total space $\mathsf{G}_1$ of the groupoid to its base space $\mathsf{G}_0$, such
that the product $xy \in \mathsf{G}_1$ of $x,y \in \mathsf{G}_1$ is defined whenever $s(x) = t(y)$, and satisfies
$s(xy) = s(y)$, $t(xy) = t(x)$, and $s(x^{-1}) = t(x)$. Four relevant examples are:

• Spaces, where $\mathsf{G}_1 = \mathsf{G}_0 = Q$ for some set $Q$, with $s(x) = t(x) = x$ for all $x \in \mathsf{G}_1$,
and hence $xy$ is defined iff $y = x$, with result $xx = x$; furthermore, $x^{-1} = x$.

• Groups, where $\mathsf{G}_1 = G$ and $\mathsf{G}_0 = \{e\}$, with $s(x) = t(x) = e$ for all $x$, so that all
elements can be multiplied and the notion of a groupoid reduces to a group.

• Pair groupoids over a set $Q$ have base space $\mathsf{G}_0 = Q$, total space $\mathsf{G}_1 = Q \times Q$, and
projections $s(q,q') = q'$ and $t(q,q') = q$, so that $(q,q')(r,r')$ is defined iff $q' = r$,
resulting in $(q,q')(q',r') = (q,r')$. The inverse is given by $(q,q')^{-1} = (q',q)$.

• Action groupoids (also called semi-direct product groupoids) are important in
what follows. These originate in some group action we denote by $G \circlearrowright Q$, where
$G$ is a group and $Q$ is a set. The ensuing groupoid is called $\Gamma = G \ltimes Q$, where

$$\Gamma_1 = G \times Q, \quad \Gamma_0 = Q, \quad s(x,q) = x^{-1}q, \quad t(x,q) = q, \tag{7.85}$$

so that products $(x,q)(y,q')$ are defined iff $q' = x^{-1}q$, with result

$$(x,q)(y,x^{-1}q) = (xy,q). \tag{7.86}$$

Finally, the inverse is (necessarily) given by

$$(x,q)^{-1} = (x^{-1},x^{-1}q). \tag{7.87}$$
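
The rules for the action groupoid can be tested exhaustively on a finite example. The following sketch takes $G = \mathbb{Z}_6$ acting on $Q = \mathbb{Z}_3$ by $x\cdot q = q + x \bmod 3$ and implements source $s(x,q) = x^{-1}q$, target $t(x,q) = q$, the partial product $(x,q)(y,x^{-1}q) = (xy,q)$ and the inverse $(x,q)^{-1} = (x^{-1},x^{-1}q)$. The particular group, space and function names are hypothetical choices for illustration.

```python
GROUP_ORDER = 6  # G = Z_6, written additively (illustrative choice)
SPACE_SIZE = 3   # Q = Z_3

def act(x, q):
    """Group action of G on Q: x . q = q + x (mod 3)."""
    return (q + x) % SPACE_SIZE

def group_inv(x):
    return (-x) % GROUP_ORDER

def source(arrow):
    x, q = arrow
    return act(group_inv(x), q)      # s(x, q) = x^{-1} q

def target(arrow):
    _, q = arrow
    return q                         # t(x, q) = q

def compose(a, b):
    """Partial product; defined only if s(a) = t(b)."""
    if source(a) != target(b):
        raise ValueError("arrows are not composable")
    (x, q), (y, _) = a, b
    return ((x + y) % GROUP_ORDER, q)

def inverse(arrow):
    x, q = arrow
    return (group_inv(x), act(group_inv(x), q))

def unit(q):
    """Unit arrow at the base point q."""
    return (0, q)

ARROWS = [(x, q) for x in range(GROUP_ORDER) for q in range(SPACE_SIZE)]

def check_axioms():
    """Verify s(ab) = s(b), t(ab) = t(a) and the inverse laws on all arrows."""
    for a in ARROWS:
        for b in ARROWS:
            if source(a) == target(b):
                ab = compose(a, b)
                if source(ab) != source(b) or target(ab) != target(a):
                    return False
        if compose(a, inverse(a)) != unit(target(a)):
            return False
        if compose(inverse(a), a) != unit(source(a)):
            return False
    return True
```

Taking $Q$ to be a single point reduces this to the group $\mathbb{Z}_6$ itself, in line with the second example of the list.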

A Lie groupoid is a groupoid $\mathsf{G}$ where $\mathsf{G}_1$ and $\mathsf{G}_0$ are manifolds and all operations
are smooth. In all examples just given this requires $Q$ to be a manifold, and in the
last one $G$ should be a Lie group, and the given action $G \times Q \to Q$ must be smooth.

Generalizing the construction of a Lie algebra $\mathfrak{g}$ from a given Lie group $G$, a Lie
groupoid comes with an associated linearized (or "infinitesimal") structure, called
a Lie algebroid. As in the group case, this differential-geometric notion can also be
defined independently of its origin in the theory of Lie groupoids, as follows:
Definition 7.3. A Lie algebroid $E$ over a manifold $Q$ is a vector bundle $E \to Q$ with
a vector bundle map $a : E \to TQ$ (called the anchor), as well as with a Lie bracket $[\,,\,]$
on the space $C^\infty(Q,E)$ of smooth cross-sections of $E$, satisfying the Leibniz rule

$$[\sigma_1, f\cdot\sigma_2] = f\cdot[\sigma_1,\sigma_2] + (a\circ\sigma_1 f)\cdot\sigma_2 \tag{7.88}$$

for all $\sigma_1,\sigma_2 \in C^\infty(Q,E)$ and $f \in C^\infty(Q)$ (here $a\circ\sigma_1$ is a vector field on $Q$).

It follows that the map $\sigma \mapsto a\circ\sigma : C^\infty(Q,E) \to C^\infty(Q,TQ)$ induced by the anchor
is a homomorphism of Lie algebras, where the latter is equipped with the usual
commutator of vector fields (this homomorphism property used to be part of the
definition of a Lie algebroid, but in fact it follows from the stated definition).

Lie algebroids generalize (finite-dimensional) Lie algebras as well as tangent
bundles, and the (infinite-dimensional) Lie algebra $C^\infty(Q,E)$ could be said to be of
geometric origin in the sense that it derives from an underlying finite-dimensional
geometrical object. Similar to the above list of examples of Lie groupoids, one has
the following basic classes of Lie algebroids.

• Manifolds, where $E = Q$, seen as the zero-dimensional vector bundle over $Q$,
evidently with identically vanishing Lie bracket and anchor.

• Lie algebras, where $E = \mathfrak{g}$ and $Q$ is a point (which may be identified with the
identity element of any Lie group with Lie algebra $\mathfrak{g}$) and anchor $a = 0$.

• Tangent bundles over a manifold $Q$, where $E = TQ$ and $a = \mathrm{id} : TQ \to TQ$, with
the Lie bracket given by the usual commutator of vector fields (or derivations).

• Action algebroids (or semi-direct product algebroids) are defined by a $\mathfrak{g}$-action
on a manifold $Q$, i.e. a Lie algebra homomorphism $\mathfrak{g} \to C^\infty(Q,TQ)$, $A \mapsto \xi_A$,
where we identify vector fields on $Q$ with derivations on $C^\infty(Q)$; these are often,
but not necessarily, obtained from a $G$-action on $Q$ via (3.71). We write $E = \mathfrak{g} \ltimes Q$,
which is $E = \mathfrak{g} \times Q$ as a trivial bundle (with $\pi$ the projection on the second
space), and $a(A,q) = -\xi_A(q) \in T_qQ$, where $A \in \mathfrak{g}$. The Lie bracket is given by

$$[\sigma_1,\sigma_2](q) = [\sigma_1(q),\sigma_2(q)]_{\mathfrak{g}} + \xi_{\sigma_2(q)}\sigma_1(q) - \xi_{\sigma_1(q)}\sigma_2(q). \tag{7.89}$$

These examples may also be recovered as special cases of the following construction
that canonically associates a Lie algebroid $\mathrm{Lie}(\mathsf{G})$ to a Lie groupoid $\mathsf{G}$: as a vector
bundle, $\mathrm{Lie}(\mathsf{G})$ is the restriction of $\ker(t_*)$ to $\mathsf{G}_0$ (where $t_* : T\mathsf{G}_1 \to T\mathsf{G}_0$ is the
derivative map of the target projection $t : \mathsf{G}_1 \to \mathsf{G}_0$), and the anchor is $a = s_*$ (one
may alternatively define $\mathrm{Lie}(\mathsf{G})$ as the normal bundle to the object inclusion map
$\iota : \mathsf{G}_0 \hookrightarrow \mathsf{G}_1$, cf. Definition C.115, but this makes the definition of the anchor a bit
more complicated). As in the Lie group case, one may identify sections of $\mathrm{Lie}(\mathsf{G})$
with left-invariant vector fields on $\mathsf{G}$, and under this identification the Lie bracket
on $C^\infty(\mathsf{G}_0,\mathrm{Lie}(\mathsf{G}))$ is by definition given by the commutator of vector fields.

Conversely, one may ask whether a given Lie algebroid $E$ is integrable, in that
$E \cong \mathrm{Lie}(\mathsf{G})$ for some Lie groupoid $\mathsf{G}$ (where the isomorphism sign $\cong$ means that
a pertinent vector bundle isomorphism $E \cong \ker(t_*)|_{\mathsf{G}_0}$ should preserve all relevant
structure). Unlike the special case of Lie groups (where Lie's Third Theorem 5.41
settles this in the positive), this is not necessarily the case, but that is of no concern.
We now state a crucial connection between Lie algebroids and Poisson geometry.

Proposition 7.4. The dual vector bundle $E^*$ of a Lie algebroid $E$ is a Poisson manifold,
whose Poisson bracket on $C^\infty(E^*)$ is defined by the following special cases:

$$\{f,g\} = 0 \quad (f,g \in C^\infty(Q)); \tag{7.90}$$

$$\{\hat{\sigma},f\} = -a\circ\sigma f \quad (\sigma \in C^\infty(Q,E),\ f \in C^\infty(Q)); \tag{7.91}$$

$$\{\hat{\sigma}_1,\hat{\sigma}_2\} = -\widehat{[\sigma_1,\sigma_2]}, \tag{7.92}$$

where $\hat{\sigma} \in C^\infty(E^*)$ is defined by a given section $\sigma$ of $E$ through the obvious pairing.

Conversely, if the dual $F^*$ to a given vector bundle $F \to Q$ is a Poisson manifold
such that the Poisson bracket of two linear functions is linear, then $F \cong E$ for some
Lie algebroid $E$ over $Q$, with the above Poisson structure on $E^*$.

Following our earlier lists, the main examples are:

• A manifold $Q$, seen as the dual to the zero-dimensional vector bundle $Q \to Q$,
carries the zero Poisson structure.

• The dual $\mathfrak{g}^*$ of a Lie algebra $\mathfrak{g}$ acquires (minus) the Lie-Poisson structure (3.98).

• A cotangent bundle $T^*Q$ acquires (minus) the Poisson structure defined by its
standard symplectic structure, cf. (3.34).

• The dual $\mathfrak{g}^* \ltimes Q$ of an action algebroid acquires the Poisson bracket (7.58).

The following theorem displays a rich and physically relevant class of examples
of Definition 7.1 of deformation quantization. The key point is that a Lie groupoid
$\mathsf{G}$ defines both classical and quantum data, namely the (reduced) Lie groupoid C*-algebra
$C^*_{(r)}(\mathsf{G})$ (cf. §C.17) and the Poisson manifold $\mathrm{Lie}(\mathsf{G})^*$ (cf. Proposition 7.4),
and these are continuously (even smoothly) related through the tangent groupoid
$\mathsf{G}^T$ (cf. Proposition C.117) and its associated Lie groupoid C*-algebra $C^*_{(r)}(\mathsf{G}^T)$.

Theorem 7.5. For any Lie groupoid $\mathsf{G}$, the bundle of C*-algebras given by

$$A_0 = C_0(\mathrm{Lie}(\mathsf{G})^*) \quad (\hbar = 0); \tag{7.93}$$

$$A_\hbar = C^*(\mathsf{G}) \quad (0 < \hbar \leq 1); \tag{7.94}$$

$$A = C^*(\mathsf{G}^T), \tag{7.95}$$

defines a deformation quantization of the Poisson manifold $\mathrm{Lie}(\mathsf{G})^*$ over $I = [0,1]$.
The same statement holds for the corresponding reduced groupoid C*-algebras.

The key lemma for this theorem is Theorem C.123, which provides the continuity of
the given bundle of C*-algebras. A lengthy computation shows that also the Dirac-Groenewold-Rieffel
condition (7.14) is met. In this light, the quantization of the
phase space $T^*\mathbb{R}^n$ in §7.1 then corresponds to the pair groupoid $\mathsf{G} = \mathbb{R}^n \times \mathbb{R}^n$ on $\mathbb{R}^n$,
the one in §7.2 follows from the special case where the Lie groupoid $\mathsf{G}$ is "simply"
a Lie group, and the case of §7.3, which puts Mackey's quantization theory in a
deformation framework, is obviously given by the action groupoid $G \ltimes Q$. Finally,
the space groupoid $\mathsf{G}_0 = \mathsf{G}_1 = Q$ gives a trivial continuous bundle of C*-algebras,
where $A_\hbar = C_0(Q)$ for all $\hbar \in [0,1]$, and $Q$ carries the zero Poisson bracket.
7.5 Induced representations and the imprimitivity theorem

Returning to §7.3, we recall the bijective correspondence between systems of imprimitivity
$(u(G),\pi(C_0(Q)))$ and non-degenerate representations of the C*-algebra
$C^*(G,Q)$ of the action groupoid defined by the given action $G \circlearrowright Q$. This correspondence
preserves irreducibility, and our task is to find irreducible representations.

It was recognized at least 50 years ago that this task can be carried out if the
group action satisfies a certain regularity condition, and is hopeless otherwise. This
is sometimes called the Mackey-Glimm dichotomy. The condition in question may
be stated in a number of equivalent ways (whose equivalence is not at all obvious).

First, we recall some terminology from topology. Let $X$ be a space. One calls
$Y \subset Y' \subset X$ relatively open in $Y'$ if there is an open set $U \subset X$ such that $Y = Y' \cap U$.
A subset $Y \subset X$ is locally closed if each $y \in Y$ has an open neighbourhood $U$ in $X$
such that $U \cap Y$ is closed in $U$, and finally "$X$ is $T_0$" if for any two distinct points there
is an open set that contains exactly one of them. Furthermore, each $q \in Q$ defines a
$G$-orbit through $q$ denoted by $G\cdot q$, as well as a stabilizer (or "little group")

$$G_q = \{x \in G \mid x\cdot q = q\}. \tag{7.96}$$

For any subgroup $H \subset G$, we denote the equivalence class of $x$ in $G/H$ by $[x]$.

Definition 7.6. A smooth action of a Lie group $G$ on a manifold $Q$ is called regular
if one and hence each of the following equivalent conditions is satisfied:

1. Each $G$-orbit in $Q$ is relatively open in its closure;

2. Each $G$-orbit in $Q$ is locally closed;

3. The quotient space $Q/G$ of $G$-orbits in $Q$ is $T_0$;

4. Each map $[x] \mapsto xq$ is a homeomorphism from $G/G_q$ to the orbit $G\cdot q$ ($q \in Q$).

Probably the simplest example of a non-regular action is the action $\mathbb{Z} \circlearrowright \mathbb{T}$ given by

$$n : z \mapsto e^{2\pi i n\theta}z, \tag{7.97}$$

where $\theta \in \mathbb{R}\setminus\mathbb{Q}$ (here $\mathbb{Z}$ may be seen as a zero-dimensional Lie group with infinitely
many components; in fact, Definition 7.6 more generally applies to second countable
locally compact groups and spaces that are "almost Hausdorff"). Indeed, each
orbit is dense in $\mathbb{T}$ (but not open), and the orbit space $\mathbb{T}/\mathbb{Z}$ has no proper open sets.
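
The density of the orbits can be illustrated numerically. In the sketch below the circle is parametrized by $q \in [0,1)$ via $z = e^{2\pi i q}$, so that one step of the action is $q \mapsto q + \theta \bmod 1$ (this normalization, the use of only the forward half of the orbit, and the function names are assumptions of the example). For an irrational $\theta$ the largest gap left by the first $N$ orbit points shrinks as $N$ grows, whereas for a rational $\theta$ the orbit is finite and the gap stays fixed.

```python
import math

def orbit(theta, n_points, q0=0.0):
    """First n_points of the forward orbit of q0 under q -> q + theta (mod 1)."""
    return [(q0 + n * theta) % 1.0 for n in range(n_points)]

def largest_gap(points):
    """Largest arc of the circle [0, 1) containing none of the given points."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])  # wrap-around gap
    return max(gaps)

GOLDEN = (math.sqrt(5) - 1) / 2  # an irrational rotation number

gap_irrational_small = largest_gap(orbit(GOLDEN, 50))
gap_irrational_large = largest_gap(orbit(GOLDEN, 1000))
gap_rational = largest_gap(orbit(0.25, 1000))  # finite orbit with four points
```

A finite computation of course proves nothing about density; it only displays the qualitative difference between the regular (rational) and non-regular (irrational) case.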

Theorem 7.7. Let a group action $G \circlearrowright Q$ be regular. Then the irreducible representations
of the associated action groupoid C*-algebra $C^*(G,Q)$ (and hence also the
irreducible systems of imprimitivity $(u(G),\pi(C_0(Q)))$) are classified up to unitary
equivalence by pairs $(\mathcal{O},u_\chi)$, where $\mathcal{O}$ is a $G$-orbit in $Q$ and $u_\chi$ is an irreducible
representation of the stabilizer $G_q$ of an arbitrary point $q \in \mathcal{O}$, with an explicit
construction of the corresponding representation $\rho^{(\mathcal{O},\chi)}(C^*(G,Q))$. Two such representations
$\rho^{(\mathcal{O},\chi)}$ and $\rho^{(\mathcal{O}',\chi')}$ are equivalent iff $\mathcal{O} = \mathcal{O}'$ and, given that $q' = xq$
and hence $G_{q'} = xG_qx^{-1}$ for some $x \in G$, $u_\chi$ is unitarily equivalent to $u_{\chi'}\circ\mathrm{Ad}(x)$.
Finally, any irreducible representation $\rho$ is unitarily equivalent to some $\rho^{(\mathcal{O},\chi)}$.
In the simplest case, $Q$ is equal to a point, so that $C^*(G,Q) = C^*(G)$, and we find
that irreducible representations of $C^*(G)$ (which are necessarily non-degenerate)
bijectively correspond to unitary irreducible representations of $G$. In the next easiest
case, $G$ acts nontrivially but still transitively on $Q$, in which case the action is clearly
regular and $Q \cong G/H$ through the $G$-equivariant map in no. 4 of the above definition
(read in the opposite direction), i.e., we pick some $q_0 \in Q$, define $H = G_{q_0}$, and
finally map $Q$ to $G/H$ by $q \mapsto [x]$, where $q = xq_0$ (this map is well defined); in that
case, we might as well have assumed that $Q = G/H$ to begin with. The following
important corollary of Theorem 7.7 is called the Imprimitivity Theorem.

Corollary 7.8. Up to unitary equivalence, irreducible representations of $C^*(G,G/H)$
(or, equivalently, of pairs $(\pi(C_0(G/H)),u(G))$ satisfying the covariance condition
(7.81)) bijectively correspond to unitary irreducible representations of $H$.

In preparation for the general case stated in Theorem 7.7, and also as a goal in
itself, we first give an explicit construction of the irreducible representation $\rho^\chi$
of $C^*(G,G/H)$ corresponding to a given unitary irreducible representation $u_\chi(H)$,
where we label the unitary irreducible representations of $H$ (up to unitary equivalence)
by $\chi \in \hat{H}$ (where $\hat{H}$ is the set of unitary equivalence classes of unitary irreducible
representations of $H$, cf. §C.15 for the abelian case), and let the corresponding
representation $\rho^\chi(C^*(G,G/H))$, or the pair $\pi^\chi(C_0(G/H))$ and $u^\chi(G)$,
inherit this label (in raised form, in order to prevent confusion between $u_\chi(H)$ and
$u^\chi(G)$).
The construction of $\rho^\chi(C^*(G,G/H))$, or, equivalently, of a system of imprimitivity
$(\pi^\chi(C_0(G/H)),u^\chi(G))$, from $u_\chi(H)$ proceeds by the technique of induced
representations (which physicists may be familiar with from the representation theory
of the Poincaré group, see Theorem 7.9 below). We start from a specific realization
of $u_\chi(H)$ on a Hilbert space $\mathcal{H}_\chi$ (which is finite-dimensional if $H$ is compact or
abelian). From this, we construct a new Hilbert space $\mathcal{H}^\chi$, whose realization depends
on the choice of a quasi-invariant measure $\nu$ on $G/H$, i.e., a (non-zero) measure
whose null-sets are $G$-invariant in the sense that if $\nu(A) = 0$ for some (Borel) measurable
$A \subset G/H$, then also $\nu(x\cdot A) = 0$ for each $x \in G$. This will surely be the
case if $\nu$ is invariant, i.e., if $\nu(x\cdot A) = \nu(A)$ for each measurable $A$, but invariant
measures on $G/H$ may not exist, whereas quasi-invariant measures always do.

We now consider (measurable) functions $\psi : G \to \mathcal{H}_\chi$ that satisfy

$$\psi(xh) = u_\chi(h^{-1})\psi(x), \tag{7.98}$$

for every $x \in G$ and $h \in H$; equivalently, we may say that

$$u_\chi(h)\circ R_h\psi = \psi, \tag{7.99}$$

for each $h \in H$, where $R_h\psi(x) = \psi(xh)$. Now if $\psi$ and $\varphi$ both satisfy (7.98), then,
by unitarity of $u_\chi$, their inner product $\langle\varphi(x),\psi(x)\rangle_{\mathcal{H}_\chi}$ in $\mathcal{H}_\chi$ is $H$-invariant, in that

$$\langle\varphi(xh),\psi(xh)\rangle_{\mathcal{H}_\chi} = \langle\varphi(x),\psi(x)\rangle_{\mathcal{H}_\chi}. \tag{7.100}$$
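Hence the function $x \mapsto \langle\varphi(x),\psi(x)\rangle_{\mathcal{H}_\chi}$, a priori defined from $G$ to $\mathbb{C}$, induces
a function $[x] \mapsto \langle\varphi(x),\psi(x)\rangle_{\mathcal{H}_\chi}$ from $G/H$ to $\mathbb{C}$. We write the latter function as
$\langle\varphi,\psi\rangle_{\mathcal{H}_\chi}[x]$; in particular, taking $\varphi = \psi$, we write $\|\psi\|^2_{\mathcal{H}_\chi}[x] = \langle\psi(x),\psi(x)\rangle_{\mathcal{H}_\chi}$. We
may then define a new Hilbert space $\mathcal{H}^\chi$ that consists of all measurable functions
$\psi : G \to \mathcal{H}_\chi$ that for each $h \in H$ satisfy (7.98), and are square-integrable on $G/H$:

$$\int_{G/H} d\nu([x])\, \|\psi\|^2_{\mathcal{H}_\chi}[x] < \infty. \tag{7.101}$$

This space turns out to be complete in the natural inner product

$$\langle\varphi,\psi\rangle = \int_{G/H} d\nu([x])\, \langle\varphi,\psi\rangle_{\mathcal{H}_\chi}[x]. \tag{7.102}$$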

It also carries a system of imprimitivity: in case that $\nu$ is $G$-invariant we simply have

$$u^\chi(y)\psi(x) = \psi(y^{-1}x) \quad (x,y \in G); \tag{7.103}$$

$$\pi^\chi(f)\psi(x) = f([x])\psi(x) \quad (f \in C_0(G/H)), \tag{7.104}$$

where we note that $u^\chi(y)\psi$ satisfies (7.98) if $\psi$ does. Unitarity of $u^\chi(y)$ as well as the
covariance condition (7.81) are easily checked. In general, we replace (7.103) by

$$u^\chi(y)\psi(x) = \sqrt{\frac{d\nu([y^{-1}x])}{d\nu([x])}}\,\psi(y^{-1}x), \tag{7.105}$$

where $d\nu([y^{-1}\cdot])/d\nu([\cdot])$ is the Radon-Nikodym derivative of the translated measure
$L_y^*\nu$ with respect to $\nu$, cf. (B.137), which is well defined because by the assumption
of quasi-invariance, $L_y^*\nu$ is absolutely continuous with respect to $\nu$ (indeed, on
this assumption they are even equivalent). Here $L_y^*\nu(A) = \nu(y^{-1}(A))$, $A \subset G/H$.
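
For a finite group every step of this construction can be carried out by hand, with counting measure as the invariant measure. The sketch below takes $G = S_3$ acting on $Q = \{0,1,2\}$, $H$ the stabilizer of the point $2$, and $u_\chi$ the sign character of $H$; it builds a function satisfying the equivariance condition $\psi(xh) = u_\chi(h^{-1})\psi(x)$, defines $u^\chi(y)\psi(x) = \psi(y^{-1}x)$ and $\pi^\chi(f)\psi(x) = f([x])\psi(x)$, and checks the covariance condition $u(y)\pi(f)u(y)^* = \pi(L_y f)$. The group, the character, the identification $[x] \mapsto x\cdot 2$ and all names are illustrative assumptions.

```python
from itertools import permutations

G = list(permutations(range(3)))   # S_3; x[i] is the image of i under x
E = (0, 1, 2)
SWAP = (1, 0, 2)
H = [E, SWAP]                      # stabilizer of the point 2
CHI = {E: 1, SWAP: -1}             # sign character u_chi of H

def mul(x, y):
    """Composition (xy)(i) = x(y(i))."""
    return tuple(x[y[i]] for i in range(3))

def inv(x):
    result = [0, 0, 0]
    for i, image in enumerate(x):
        result[image] = i
    return tuple(result)

def coset_point(x):
    """Identify the coset [x] in G/H with the point x . 2 in Q."""
    return x[2]

def equivariant(phi):
    """Project an arbitrary function on G onto solutions of psi(xh) = chi(h^-1) psi(x)."""
    return {x: sum(CHI[h] * phi[mul(x, h)] for h in H) for x in G}

def is_equivariant(psi):
    return all(psi[mul(x, h)] == CHI[inv(h)] * psi[x] for x in G for h in H)

def u(y, psi):
    """Induced representation: (u(y) psi)(x) = psi(y^-1 x)."""
    return {x: psi[mul(inv(y), x)] for x in G}

def pi(f, psi):
    """(pi(f) psi)(x) = f([x]) psi(x) for a function f on Q."""
    return {x: f[coset_point(x)] * psi[x] for x in G}

def left_translate(y, f):
    """(L_y f)(q) = f(y^-1 q)."""
    y_inv = inv(y)
    return {q: f[y_inv[q]] for q in range(3)}

def covariance_holds(f, psi):
    return all(u(y, pi(f, u(inv(y), psi))) == pi(left_translate(y, f), psi)
               for y in G)
```

Integer-valued test data keep all comparisons exact. Since $G/H$ has three points and $\mathcal{H}_\chi = \mathbb{C}$, the induced representation here is three-dimensional.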

Physicists do not like the Hilbert space $\mathcal{H}^\chi$, preferring a different realization

$$\tilde{\mathcal{H}}^\chi = L^2(G/H)\otimes\mathcal{H}_\chi, \tag{7.106}$$

in which the wave-function $\psi$ is not constrained and one has a clean separation
between the (typically) spatial degree of freedom $Q = G/H$ and the internal degree
of freedom $\mathcal{H}_\chi$. One half of the system of imprimitivity will then be given nicely by

$$\tilde{\pi}^\chi(f)\psi = f\psi \quad (f \in C_0(G/H)), \tag{7.107}$$

but this cleanliness comes at the cost of a more complicated formula for $\tilde{u}^\chi(y)$, as
follows. Pick a (measurable) cross-section $s : G/H \to G$, i.e., a right inverse to the
projection $p : G \to G/H$, $p(x) = [x]$; in other words, we have

$$p\circ s = \mathrm{id}_{G/H}. \tag{7.108}$$
It may not be possible to make $s$ continuous, and, crucially, $s$ is not a left inverse to
$p$; instead, there exists a unique function $h_s : G \to H$ such that $s\circ p(x) = xh_s(x)$, i.e.,

$$h_s(x) = x^{-1}s([x]). \tag{7.109}$$

Such a cross-section $s$ gives rise to a unitary isomorphism

$$w_s : \mathcal{H}^\chi \to \tilde{\mathcal{H}}^\chi; \tag{7.110}$$

$$w_s\psi(q) = \psi(s(q)); \tag{7.111}$$

$$w_s^{-1}\psi(x) = u_\chi(h_s(x))\psi([x]), \tag{7.112}$$

which enables us to move the system of imprimitivity to $\tilde{\mathcal{H}}^\chi$ by defining

$$\tilde{u}^\chi(y) = w_su^\chi(y)w_s^* \quad (y \in G); \tag{7.113}$$

$$\tilde{\pi}^\chi(f) = w_s\pi^\chi(f)w_s^* \quad (f \in C_0(G/H)). \tag{7.114}$$

This duly leads to (7.107), but instead of (7.105), we obtain the more cumbersome

$$\tilde{u}^\chi(y)\psi(q) = \sqrt{\frac{d\nu(y^{-1}q)}{d\nu(q)}}\; u_\chi\big(s(q)^{-1}ys(y^{-1}q)\big)\,\psi(y^{-1}q), \tag{7.115}$$

where of course the square root may be omitted if $\nu$ is $G$-invariant, as in (7.103).


The argument $h = s(q)^{-1}ys(y^{-1}q)$ of $u_\chi$ appearing here is called the Wigner cocycle
(after the physicist who first introduced it in his classification of the irreducible
representations of the Poincaré group). One may verify that $h \in H$ by applying $p$,
which by construction is $G$-equivariant (i.e., $p(xy) = xp(y)$), which gives

$$p(h) = p(s(q)^{-1}ys(y^{-1}q)) = s(q)^{-1}y\,p(s(y^{-1}q)) = s(q)^{-1}yy^{-1}q = s(q)^{-1}q,$$

where in the third step we used (7.108). For any $x \in G$ we have $x^{-1}[x] = [x^{-1}x] = [e]$,
so taking $x = s(q)$ in this computation we find $p(h) = [e]$, which is true iff $h \in H$.
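
The Wigner cocycle is easy to compute in a low-dimensional example. The sketch below uses $G = E(2)$ acting on $Q = \mathbb{R}^2 \cong \mathbb{C}$, with group elements stored as pairs $(w,a)$ of a unit complex number $w$ (the rotation) and a complex number $a$ (the translation), so that $(w,a)\cdot q = wq + a$. With the cross-section $s(q) = (1,q)$ and stabilizer $H = SO(2)$ of the origin, the cocycle $h = s(q)^{-1}ys(y^{-1}q)$ should come out as the pure rotation $(w,0)$. The encoding by complex numbers, the choice of cross-section and the function names are assumptions of this illustration.

```python
import cmath

def mul(g, h):
    """Semi-direct product law (w, a)(w', a') = (w w', a + w a')."""
    (w, a), (w2, a2) = g, h
    return (w * w2, a + w * a2)

def inv(g):
    """Inverse (w, a)^{-1} = (w^{-1}, -w^{-1} a), using w^{-1} = conj(w) for |w| = 1."""
    w, a = g
    return (w.conjugate(), -w.conjugate() * a)

def act(g, q):
    """Action of E(2) on the plane: (w, a) . q = w q + a."""
    w, a = g
    return w * q + a

def section(q):
    """Cross-section s : Q -> G, q -> (1, q), i.e. the translation by q."""
    return (1 + 0j, q)

def wigner_cocycle(y, q):
    """h = s(q)^{-1} y s(y^{-1} q)."""
    return mul(mul(inv(section(q)), y), section(act(inv(y), q)))

y = (cmath.exp(0.7j), 1.5 - 2.0j)   # a rotation by 0.7 rad followed by a translation
q = -0.3 + 0.9j
h = wigner_cocycle(y, q)
```

In this example the cocycle does not depend on $q$ at all, which is the reason why only the rotation enters through the representation of the stabilizer in the Euclidean case.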

Given an irreducible system of imprimitivity $(u^\chi(G),\pi^\chi(C_0(Q)))$ we obtain generalized
momentum operators by passing to the associated representation of the Lie algebra
$\mathfrak{g}$ of $G$ through (5.156) and (7.57), i.e.,

$$\pi^\chi(A) = i\hbar(u^\chi)'(A), \tag{7.116}$$

where $A \in \mathfrak{g}$, so that, cf. (7.78) - (7.80), we obtain from (5.160) and (7.81):

$$[\pi^\chi(A),\pi^\chi(B)] = i\hbar\,\pi^\chi([A,B]); \tag{7.117}$$

$$[\pi^\chi(A),\pi^\chi(f)] = i\hbar\,\pi^\chi(\xi_Af); \tag{7.118}$$

$$[\pi^\chi(f),\pi^\chi(g)] = 0, \tag{7.119}$$

where $A,B \in \mathfrak{g}$ and $f,g \in C_0(Q)$ (in fact, these formulae, defined on the right
domain, work also for many unbounded functions on $Q$, see below), and $\xi_A$ is
defined in (3.71). Let us take a look at a few illustrative special cases:
• If $H = G$, then $Q$ is a point, so that $C^*(G,Q) = C^*(G)$, and systems of imprimitivity
are just irreducible representations of $G$. We have $\mathcal{H}^\chi \cong \mathcal{H}_\chi$ through the map
$w : \mathcal{H}^\chi \to \mathcal{H}_\chi$ defined by $\psi \mapsto \psi(e) = \psi' \in \mathcal{H}_\chi$, with inverse $\psi(x) = u_\chi(x^{-1})\psi'$.
This gives $wu^\chi(y)w^{-1} = u_\chi(y)$. Similarly, in (7.115) we take $s = e$, which gives
$\tilde{u}^\chi(y) = u_\chi(y)$ on $\tilde{\mathcal{H}}^\chi = \mathcal{H}_\chi$.

• If $H = \{e\}$ we have $Q = G$ and $C^*(G,G) \cong B_0(L^2(G))$, which quantizes the underlying
classical phase space $\mathfrak{g}^* \times G \cong T^*G$. We now have $\mathcal{H} = L^2(G)$ carrying
the left-regular representation of $G$.

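• Let $G = E(3)$ act canonically on $Q = \mathbb{R}^3$. Taking $q_0 = 0$ gives $H = SO(3)$, so irreducible
systems of imprimitivity are classified by $j = 0,1,\ldots$, with corresponding
irreducible representations $D_j(SO(3))$ on $\mathcal{H}_j = \mathbb{C}^{2j+1}$, cf. §5.8. Hence

$$\tilde{\mathcal{H}}^j = L^2(\mathbb{R}^3)\otimes\mathbb{C}^{2j+1}, \tag{7.120}$$

and using the cross-section $s(q) = (1_3,q)$ from $\mathbb{R}^3$ to $E(3)$ we obtain, from
(7.115) with (7.63) - (7.64) and (7.107), the expressions

$$u^j(R,a)\psi(q) = D_j(R)\psi(R^{-1}(q-a)); \tag{7.121}$$

$$\pi^j(f)\psi(q) = f(q)\psi(q). \tag{7.122}$$

For $j = 0$ this gives the usual quantum theory of a spinless particle:

1. The Hilbert space is $\tilde{\mathcal{H}}^0 = L^2(\mathbb{R}^3)$.

2. For the generators of $\mathbb{R}^3 \subset E(3)$ we duly obtain the momentum operators

$$P_j = -i\hbar\frac{\partial}{\partial q^j}, \tag{7.123}$$

where $P_j = \pi^0(e_j)$ is defined in terms of the standard basis $(e_1,e_2,e_3)$ of
$\mathbb{R}^3$, now seen as the Lie algebra of $\mathbb{R}^3$.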

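3. Using the basis (3.66) of the Lie algebra of $SO(3) \subset E(3)$, we obtain the
orbital angular momentum operators (which pick up extra terms for $j > 0$):

$$\pi^0(J_1) = i\hbar\left(q^3\frac{\partial}{\partial q^2} - q^2\frac{\partial}{\partial q^3}\right); \tag{7.124}$$

$$\pi^0(J_2) = i\hbar\left(q^1\frac{\partial}{\partial q^3} - q^3\frac{\partial}{\partial q^1}\right); \tag{7.125}$$

$$\pi^0(J_3) = i\hbar\left(q^2\frac{\partial}{\partial q^1} - q^1\frac{\partial}{\partial q^2}\right). \tag{7.126}$$

4. The coordinate functions $f(q) = q^j$ yield the position operators $Q_j = \pi^0(q^j)$:

$$Q_j\psi(q) = q^j\psi(q). \tag{7.127}$$

5. Thus we obtain all the familiar commutation relations like $[Q_i,P_j] = i\hbar\delta_{ij}$,
$[\pi^0(J_1),\pi^0(J_2)] = i\hbar\,\pi^0(J_3)$, etc., cf. (7.65) - (7.66).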
• Let $G = \mathbb{R}$ act on $Q = \mathbb{T}$, which we parametrize by $z = \exp(2\pi iq)$, $q \in [0,1)$, by

$$a : \exp(2\pi iq) \mapsto \exp(2\pi i(q+a)), \tag{7.128}$$

so that $H = \mathbb{Z}$, with $\hat{H} = \mathbb{T}$ under $u_z(n) = z^n$, $z \in \mathbb{T}$, $n \in \mathbb{Z}$, cf. (C.349). We
parametrize $\hat{H}$ by $z = \exp(i\theta)$, $\theta \in [0,2\pi)$, so that (with slight abuse of notation)
$u_\theta(n) = e^{in\theta}$. In the second description (i.e. the one of the physicists) we have

$$\tilde{\mathcal{H}}^\theta = L^2(\mathbb{T}) = L^2(0,1), \tag{7.129}$$

where the topology of $Q$ is lost for the moment. Using the cross-section

$$s(\exp(2\pi iq)) = q, \tag{7.130}$$

where $q \in [0,1)$, we obtain

$$\tilde{u}^\theta(a)\psi(q) = e^{i\theta n(a,q)}\psi(q - a + n(a,q)), \tag{7.131}$$

where $n(a,q) \in \mathbb{Z}$ is the unique integer such that $q - a + n(a,q) \in [0,1)$. The
corresponding momentum operator is formally given by the usual expression
$P = -i\hbar\,d/dq$, cf. (7.123), which appears to be independent of $\theta$ (since for any
$q \in (0,1)$ and $a$ small enough we have $n(a,q) = 0$), but in fact the $\theta$-dependence
is in its domain, which can be shown to consist of the subspace of the Sobolev
space $H^1(0,1)$ (i.e. the closure of $C^\infty([0,1])$ in the inner product (5.318) adapted
to $L^2(0,1)$, which implies $H^1(0,1) \subset C([0,1])$) whose elements satisfy

$$\psi(1) = e^{-i\theta}\psi(0). \tag{7.132}$$
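To see this, we recall that

$$P\psi = i\hbar\lim_{\varepsilon\to 0}\frac{\tilde{u}^\theta(\varepsilon)\psi - \psi}{\varepsilon}, \tag{7.133}$$

where the limit is taken in the $L^2$-norm, so that we need existence of

$$\lim_{\varepsilon\to 0}\varepsilon^{-2}\int_0^1 dq\,\big|e^{i\theta n(\varepsilon,q)}\psi(q - \varepsilon + n(\varepsilon,q)) - \psi(q)\big|^2.$$

For $0 < q < \varepsilon$ we have $n(\varepsilon,q) = 1$, whereas for $\varepsilon < q < 1$ we have $n(\varepsilon,q) = 0$,
so it is convenient to split the integral as a sum of $\int_0^\varepsilon$ and $\int_\varepsilon^1$. The second term
enforces the existence of derivatives in the $L^2$-sense (which in turn makes $\psi$
continuous on $[0,1]$) and is unproblematic, but the first requires the existence of

$$\lim_{\varepsilon\to 0}\varepsilon^{-2}\int_0^\varepsilon dq\,\big|e^{i\theta}\psi(q - \varepsilon + 1) - \psi(q)\big|^2.$$

This strange expression, then, enforces the boundary condition (7.132). In this
case there is no single position operator, but the algebra $C(\mathbb{T})$ plays its role.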
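
The representation on the circle is simple enough to experiment with. The following sketch implements $n(a,q)$ as the unique integer with $q - a + n(a,q) \in [0,1)$ and the operator $\tilde{u}^\theta(a)\psi(q) = e^{i\theta n(a,q)}\psi(q - a + n(a,q))$, so that one can check the group law $\tilde{u}^\theta(a)\tilde{u}^\theta(b) = \tilde{u}^\theta(a+b)$, the value $e^{i\theta}$ of a full turn, and the difference quotient $i\hbar(\tilde{u}^\theta(\varepsilon)\psi - \psi)/\varepsilon$ at an interior point. Wave functions are modelled as plain Python functions on $[0,1)$; this modelling, the test function and the parameter values are assumptions of the example.

```python
import cmath
import math

HBAR = 1.0  # illustrative value

def winding(a, q):
    """The unique integer n with q - a + n in [0, 1)."""
    return -math.floor(q - a)

def u(theta, a, psi):
    """Return the function u^theta(a) psi, with psi a function on [0, 1)."""
    def transformed(q):
        n = winding(a, q)
        return cmath.exp(1j * theta * n) * psi(q - a + n)
    return transformed

def momentum_quotient(theta, eps, psi, q):
    """Difference quotient i*hbar*(u(eps) psi - psi)(q) / eps."""
    return 1j * HBAR * (u(theta, eps, psi)(q) - psi(q)) / eps

def sample_psi(q):
    """A smooth, non-periodic test function on [0, 1) (illustrative)."""
    return q * q + 1j * q

THETA = 1.3
A, B = 0.35, 0.8
POINTS = [0.05, 0.3, 0.6, 0.95]

# Largest violation of u(a) u(b) = u(a + b) on the sample points.
group_law_defect = max(
    abs(u(THETA, A, u(THETA, B, sample_psi))(q) - u(THETA, A + B, sample_psi)(q))
    for q in POINTS
)
```

At interior points the difference quotient approaches $-i\hbar\psi'(q)$ whatever the value of $\theta$, in line with the remark that the $\theta$-dependence sits in the domain of $P$, i.e. in the behaviour near the endpoints.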
7.6 Representations of semi-direct products

The case $Q = G/H$ also provides the key for the general case, as long as the $G$-action
on $Q$ is regular, cf. Theorem 7.7. In that case, the construction of the irreducible system
of imprimitivity $(u(G),\pi(C_0(Q)))$ corresponding to a pair $(\mathcal{O},u_\chi(H))$, where
$\mathcal{O}$ is a $G$-orbit in $Q$, requires no new ideas: we have $\mathcal{O} \cong G/H$, and hence $u = u^\chi$
and $\pi = \pi^\chi$ as described in §7.5 (where the function $f$ in formulae like (7.104) or
(7.114), which in these expressions was defined on $G/H$, should be seen as the restriction
of $f \in C_0(Q)$ to $\mathcal{O} \subset Q$). An important application of this construction is
the representation theory of regular semi-direct products $L \ltimes V$ (cf. §7.3), where
regularity means that the dual $L$-action on $V^*$ is regular; this action is given by

$$\lambda\cdot\theta(v) = \theta(\lambda^{-1}\cdot v) \quad (\lambda \in L,\ \theta \in V^*,\ v \in V). \tag{7.134}$$

Theorem 7.9. Up to unitary equivalence, the irreducible unitary representations of
a regular semi-direct product $G = L \ltimes V$ are classified by pairs $(\mathcal{O},\sigma)$, where $\mathcal{O}$ is
an $L$-orbit in $V^*$ and $\sigma$ is an element of the unitary dual of the stabilizer $L_0 \subset L$ of an
arbitrary point $\theta_0 \in \mathcal{O}$. The corresponding representation $u^{(\mathcal{O},\sigma)}(G)$ may be realized
from an irreducible representation $u_\sigma$ of $L_0$ on a Hilbert space $\mathcal{H}_\sigma$ combined with a
cross-section $s : L/L_0 \to L$ of the canonical projection $p : L \to L/L_0$, namely through

$$\mathcal{H}^{(\mathcal{O},\sigma)} = L^2(\mathcal{O})\otimes\mathcal{H}_\sigma; \tag{7.135}$$

$$u^{(\mathcal{O},\sigma)}(\lambda,v)\psi(\theta) = e^{i\theta(v)}u_\sigma\big(s(\theta)^{-1}\lambda s(\lambda^{-1}\theta)\big)\psi(\lambda^{-1}\theta). \tag{7.136}$$

Proof. Let $u$ be a unitary representation of $G$. This implies

$$u(\lambda)u(v)u(\lambda^{-1}) = u(\lambda\cdot v), \tag{7.137}$$

in which $\lambda \equiv (\lambda,0)$ and $v \equiv (e,v)$. Since $V \subset G$ is abelian, we have $C^*(V) \cong C_0(V^*)$
by the Fourier transform (cf. Theorem C.109 in §C.15), which here is given by
(7.44) - (7.45), with $A \rightsquigarrow v$. Hence the representation $u(C^*(V))$ defined by $u(V)$
via (5.172), seen as a representation $\hat{u}$ of $C_0(V^*)$ via the Fourier transform, is given
by

$$\hat{u}(\hat{f}) = (2\pi)^{-n}\int_{V\times V^*} d^nv\,d^n\theta\; e^{i\theta(v)}\hat{f}(\theta)\,u(v). \tag{7.138}$$

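Using invariance of the measure $d^nv\,d^n\theta$ under the joint transformation $(v,\theta) \mapsto
(\lambda\cdot v,\lambda\cdot\theta)$, from (7.137) we obtain, for $\hat{f} \in C_0(V^*)$ in the image of $f \in C_c^\infty(V)$,

$$u(\lambda)\hat{u}(\hat{f})u(\lambda)^* = (2\pi)^{-n}\int_{V\times V^*} d^nv\,d^n\theta\; e^{i\theta(v)}\hat{f}(\theta)\,u(\lambda\cdot v)$$

$$= (2\pi)^{-n}\int_{V\times V^*} d^nv\,d^n\theta\; e^{i(\lambda\cdot\theta)(\lambda\cdot v)}\hat{f}(\lambda^{-1}\cdot\lambda\cdot\theta)\,u(\lambda\cdot v)$$

$$= (2\pi)^{-n}\int_{V\times V^*} d^nv\,d^n\theta\; e^{i\theta(v)}\hat{f}(\lambda^{-1}\cdot\theta)\,u(v)$$

$$= \hat{u}(L_\lambda\hat{f}). \tag{7.139}$$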
Consequently, a unitary representation $u(L \ltimes V)$ defines a system of imprimitivity
$(u(L),\hat{u}(C_0(V^*)))$, and vice versa, since any pair of representations $(u(L),u(V))$
that satisfies (7.137) gives rise to a representation $u(G)$ by $u(\lambda,v) = u(v)u(\lambda)$.

Now apply Theorem 7.7 with $G \rightsquigarrow L$ and $Q \rightsquigarrow V^*$. All we need in order to obtain
(7.135) - (7.136) from (7.106) and (7.107) - (7.115) is to find the representation
$u(V)$ that induces the representation $\hat{u}(C_0(V^*))$ given by (7.107), namely

$$u(v)\psi(\theta) = e^{i\theta(v)}\psi(\theta), \tag{7.140}$$

as is easily checked from (7.138). □

In view of this, we have a remarkable group-groupoid C*-algebra isomorphism

$$C^*(L \ltimes V) \cong C^*(L \ltimes V^*), \tag{7.141}$$

where the left-hand side is just the C*-algebra of the group $L \ltimes V$, whereas the right-hand
side is the C*-algebra of the action groupoid $L \ltimes V^*$ relative to (7.134). Also,
a computation shows that the same formulae (7.135) - (7.136) are obtained if, given
$\theta_0 \in V^*$ and hence given $L_0$ as its stabilizer, we define a subgroup $H \subset G$ by

$$H = L_0 \ltimes V, \tag{7.142}$$

and induce from the representation $u^{(\theta_0,\sigma)}$ of $H$ defined by

$$u^{(\theta_0,\sigma)}(\lambda,v) = e^{i\theta_0(v)}u_\sigma(\lambda). \tag{7.143}$$

We briefly discuss four basic examples from physics, each of which is easily seen
to be regular. We write $a$ instead of $v$ in $(\lambda, v) \in G$ so as to emphasize the “spatial”
character of $V$, whereas $V^*$ is labeled by a dual “momentum” variable $p$.

• $G = E(2) = SO(2) \ltimes \mathbb{R}^2$, defined like $E(3)$, i.e., with respect to the usual action
of $SO(2)$ on $\mathbb{R}^2$ (this group will play a role in the representation theory of the
Poincaré group). We find the same action of $SO(2)$ on $(\mathbb{R}^2)^* \cong \mathbb{R}^2$, so that the
orbits are $\mathcal{O}_0 = \{0\}$ with $G_0 = SO(2)$ and $\mathcal{O}_r = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = r^2\}$ for
$r > 0$, with $G_r = \{e\}$. Thus the Hilbert spaces and representations are given by


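$$H^{(0,n)} = \mathbb{C}; \tag{7.144}$$

$$u^{(0,n)}(\lambda,a) = e^{2\pi i n\lambda}; \tag{7.145}$$

$$H^{r} = L^2(0,1); \tag{7.146}$$

$$u^{r}(\lambda,a)\psi(p) = e^{ir(a_1\cos p' + a_2\sin p')}\,\psi(p - \lambda \bmod 1), \tag{7.147}$$

where $n \in \mathbb{Z}$, $\lambda \in [0,1)$, $p \in (0,1)$, and $p' = 2\pi p$. In the first case $\mathbb{R}^2 \subset E(2)$ is
represented trivially, whereas in the second the $r$-dependence of the representation
lies entirely in $u^r(0,a)$ (since $H^r$ and $u^r(\lambda,0)$ are evidently independent of $r$).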
The projective representations of G are of considerable interest, too, cf. §5.10. 

Lemma 7.10. If $G = SO(p,q) \ltimes \mathbb{R}^{p+q}$ ($p > 0$, $q \geq 0$), then $H^2(\mathfrak{g},\mathbb{R}) = 0$.

Here $SO(p,q)$ is the subgroup of $SL(p+q,\mathbb{R})$ whose elements leave the form

$$x^2 = x_1^2 + \cdots + x_p^2 - (x_{p+1}^2 + \cdots + x_{p+q}^2)$$

invariant; the best-known example is the (proper) Lorentz group $SO(3,1)$, see
below. This lemma may be proved by a straightforward but lengthy computation.
By Theorem 5.59, the projective unitary representations of $G$ then correspond to
the ordinary unitary representations of the universal covering

$$\tilde{G} = \mathbb{R} \ltimes \mathbb{R}^2, \tag{7.148}$$

where $\mathbb{R}$ acts on $\mathbb{R}^2$ through the covering projection $\tilde\pi: \mathbb{R} \to SO(2) = \mathbb{R}/\mathbb{Z}$, cf.
Theorem 5.41 (with $D \cong \mathbb{Z}$). This changes the expressions (7.144) - (7.147) into

= C; (7.149) 

(A, a) = (7.150) 

(7.151) 

= (7.152) 

where $\lambda \in \mathbb{R}$, $s \in \mathbb{R}$, $\theta \in [0,2\pi)$, $p \in (0,1)$, and $n(\lambda,p)$ is defined as in (7.131).
• $G = E(3) = SO(3) \ltimes \mathbb{R}^3$, as before with the defining action of $SO(3)$. The $SO(3)$-orbits
in $(\mathbb{R}^3)^* \cong \mathbb{R}^3$ are spheres $S^2_r \cong SO(3)/SO(2)$ with radius $r > 0$, as well as
the origin ($r = 0$) with stabilizer $SO(3)$, so that for the Hilbert spaces we obtain

$$H^{(0,j)} = \mathbb{C}^{2j+1}; \tag{7.153}$$

$$H^{(r,n)} = L^2(S^2_r), \tag{7.154}$$

where $j = 0,1,\ldots$ labels the unitary irreducible representations of $SO(3)$ on $H_j = \mathbb{C}^{2j+1}$,
whereas $n \in \mathbb{Z}$ labels the irreducible representations of $SO(2)$ on $\mathbb{C}$ (we
write $S^2 \equiv S^2_1$). In the second case, the representation $u^{(r,n)}$ of $SO(3) \subset E(3)$
depends explicitly on $n$ through the Wigner cocycle; for $n = 0$ we simply obtain

$$u^{(r,0)}(R,a)\psi(p) = e^{i\langle p,a\rangle}\psi(R^{-1}p). \tag{7.155}$$

For $n \neq 0$ we just give a formula for $u^{(r,n)}(R,a)$ in case that $R$ is a rotation around
the $z$-axis and $a = 0$; this is enough to make the point. To this end we parametrize
$SO(3)$ by the well-known Euler angles, i.e., in terms of the matrices $J_i$, cf. (3.66),

$$R(\varphi,\theta,\alpha) = e^{\varphi J_3}e^{\theta J_2}e^{\alpha J_3}, \tag{7.156}$$

and write q G 8^ as q = (^,9) = R(^, 9,0)e^ with = (0,0,1) (the spherical 
coordinates of q are (0 — 9)). This also provides 8^ with an 5G(3)-invariant 

measure $d\nu(\varphi,\theta) = d\varphi\, d\theta \sin\theta$. A convenient choice of $s: S^2 \to SO(3)$ is

$$s(\varphi,\theta) = R(\varphi,\theta,-\varphi), \tag{7.157}$$

in which case we simply obtain, writing $R_z(\alpha) = R(\alpha,0,0)$,

$$u^{(r,n)}(R_z(\alpha),0)\psi(\varphi,\theta) = e^{in\alpha}\psi(\varphi-\alpha,\theta). \tag{7.158}$$

The universal covering group of $E(3)$ is

$$\tilde{E}(3) = SU(2) \ltimes \mathbb{R}^3, \tag{7.159}$$

where $SU(2) = \widetilde{SO}(3)$ acts on $\mathbb{R}^3$ through its covering projection $\tilde\pi$ onto $SO(3)$,
as in the previous case. By Theorem 5.59 and Lemma 7.10, the projective unitary
irreducible representations of $E(3)$ are given by the unitary irreducible representations
of $SU(2) \ltimes \mathbb{R}^3$. This obviously leads to additional half-integral values for
$j$ in (7.153), since this number now labels the unitary irreducible representations
of $SU(2)$. As to $n$ in (7.154), the subgroup $H \subset SU(2)$ that stabilizes $(0,0,r) \in \mathbb{R}^3$
consists of all matrices $u_z = \mathrm{diag}(z,\bar{z})$, where $z \in \mathbb{T}$, so $H \cong \mathbb{T}$ and hence $\hat{H} \cong \mathbb{Z}$
under $z \mapsto z^m$, $m \in \mathbb{Z}$. We now recall from the proof of Proposition 5.5 that

$$u = \cos(\theta/2)\cdot 1_2 + i\sin(\theta/2)\,\mathbf{u}\cdot\sigma \in SU(2), \tag{7.160}$$

where $\mathbf{u}$ is a unit vector in $\mathbb{R}^3$, projects to $\tilde\pi(u) = R_\theta(\mathbf{u}) \in SO(3)$, i.e., the rotation
around $\mathbf{u}$ by an angle $\theta$. Parametrizing $z = \cos(\alpha/2) + i\sin(\alpha/2)$, $\alpha \in [0,4\pi)$,
therefore gives $\tilde\pi(u_z) = \exp(\alpha J_3)$. Besides (7.157), we now also need a cross-section
$\tilde{s}: S^2 \to SU(2)$, for which the above analysis suggests we take

s(0,0) = (7.161) 

M^^^(0) = cos(50)- l 2 + /sin(j0)-(72; (7.162) 

= cos(^/2) • I 2 + /sin(^/2) • ay, (7.163) 

note that (a). A calculation similar to the one leading to (7.158) gives 

$$u^{(r,m)}(u_z,0)\psi(\varphi,\theta) = e^{im\alpha/2}\psi(\varphi-\alpha,\theta). \tag{7.164}$$

Comparing (7.158) and (7.164), we see that if $m$ is even, then $n = m/2$ (of course,
by convention we may replace $m/2$ in (7.164) by $n$ on the understanding that $n$
may now be half-integral). If $m$ is odd, choosing $\alpha = 2\pi$ we famously obtain

$$u^{(r,m)}(-1_2,0)\psi = -\psi. \tag{7.165}$$

More generally, if we take a closed path $t \mapsto R_{2\pi t}(\mathbf{u})$, $t \in [0,1]$, in $SO(3)$,
which starts and ends at $1_3$, and lift it (with respect to the covering projection
$\tilde\pi: SU(2) \to SO(3)$) to a path $t \mapsto u(t) = \cos(\pi t) + i\sin(\pi t)\,\mathbf{u}\cdot\sigma$ in $SU(2)$,
which now starts at $1_2$ and ends at $-1_2$, then the corresponding representation
$u^{(r,m)}(u(1),0)$ takes the wave-function $\psi$ to itself if $m$ is even, whereas it takes
$\psi$ to $-\psi$ whenever $m$ is odd (this is an embryonic version of the connection
between spin and statistics, fully realized only in quantum field theory).
• $G = L \ltimes \mathbb{R}^4$, the Poincaré group, where the Lorentz group $L = O(3,1)$ consists
of all real $4\times 4$ matrices that leave the indefinite quadratic form

$$x^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2 \tag{7.166}$$

invariant; in this context the standard coordinates on $\mathbb{R}^4$ are labeled as $(x_0,x_1,x_2,x_3)$.
The Lorentz group has four connected components, which may be identified by
the (independent) conditions $\det(\lambda) = \pm 1$ and $\pm\lambda_{00} \geq 1$. For simplicity we restrict
ourselves to the connected component $L_+^\uparrow$ of the identity, in which $\det(\lambda) = 1$
and $\lambda_{00} \geq 1$. This group is called the proper orthochronous Lorentz group,
which in turn defines the proper orthochronous Poincaré group $P_+^\uparrow = L_+^\uparrow \ltimes \mathbb{R}^4$.
Writing $p^2 = p_0^2 - p_1^2 - p_2^2 - p_3^2$, the $L_+^\uparrow$-orbits in $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ are seen to be:

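1. $\mathcal{O}_0 = \{(0,0,0,0)\}$, with stabilizer $(L_+^\uparrow)_0 = L_+^\uparrow$;

2. $\mathcal{O}_m^\pm = \{p \in \mathbb{R}^4 \mid p^2 = m^2, \pm p_0 > 0\}$, $m > 0$, with $(L_+^\uparrow)_0 = SO(3)$;

3. $\mathcal{O}_0^\pm = \{p \in \mathbb{R}^4 \mid p^2 = 0, \pm p_0 > 0\}$, with $(L_+^\uparrow)_0 = E(2)$;

4. $\mathcal{O}_{im} = \{p \in \mathbb{R}^4 \mid p^2 = -m^2\}$, $m > 0$, with $(L_+^\uparrow)_0 = SO(2,1)$.

Here the stabilizers $L_0$ are found by taking the reference points $(\pm m,0,0,0)$ in
case 2, $(\pm 1,0,0,-1)$ in case 3, and $(0,0,0,m)$ in case 4. The physically relevant
cases are probably $\mathcal{O}_m^+$ and $\mathcal{O}_0^+$. We pass straight to the universal covering group

$$\tilde{P}_+^\uparrow = SL(2,\mathbb{C}) \ltimes \mathbb{R}^4, \tag{7.167}$$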

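where the covering projection $\tilde\pi: SL(2,\mathbb{C}) \to L_+^\uparrow$ is given analogously to the case
(5.46). We again start from the four matrices $(\sigma_0,\sigma_1,\sigma_2,\sigma_3)$ in (5.42), and note:

- These form a basis for the (real) vector space of all self-adjoint $2\times 2$ matrices;

- For any $x \in \mathbb{R}^4$ we have $\det(\sum_{\mu=0}^3 x_\mu\sigma_\mu) = x^2$ as defined in (7.166);

- For any $\tilde\lambda \in SL(2,\mathbb{C})$ and $a \in M_2(\mathbb{C})$ we have $\det(\tilde\lambda a\tilde\lambda^*) = \det(a)$;

- For any $\tilde\lambda \in SL(2,\mathbb{C})$ and self-adjoint $a \in M_2(\mathbb{C})$, $\tilde\lambda a\tilde\lambda^*$ is again self-adjoint.

Taking $a = \sum_\mu x_\mu\sigma_\mu$, it follows that for $\tilde\lambda \in SL(2,\mathbb{C})$ and $x \in \mathbb{R}^4$ there must be
$\lambda \in O(3,1)$ such that $\tilde\lambda\big(\sum_\mu x_\mu\sigma_\mu\big)\tilde\lambda^* = \sum_\mu(\lambda\cdot x)_\mu\sigma_\mu$. By continuity and the fact
that $SL(2,\mathbb{C})$ is connected it follows that in fact $\lambda \in L_+^\uparrow$, so we put $\tilde\pi(\tilde\lambda) = \lambda$. As
for (5.46), the kernel is $\ker(\tilde\pi) = \mathbb{Z}_2 = \{\pm 1_2\}$. This enlarges the stabilizers: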

1. For $\mathcal{O}_m^+$ we now obtain $(\tilde{L}_+^\uparrow)_0 = SU(2)$, leading to a family of unitary irreducible
representations labeled by mass $m > 0$ and spin $j = 0, \tfrac12, 1, \ldots$.

2. For $\mathcal{O}_0^+$ the stabilizer $(\tilde{L}_+^\uparrow)_0$ of $(1,0,0,1)$ is a double cover $E(2)'$ of $E(2)$,
whose unitary irreducible representations are labeled by either $(0,n)$ with $n \in \mathbb{Z}/2$
(called helicity) or by $r > 0$. The latter case does not occur in nature.

On the one hand, this classification is a triumph of mathematical physics, but on
the other hand, it fails to single out which cases actually occur in nature: as far
as we know, these are spin $j = 0$ and $j = \tfrac12$ and helicity $n = \pm 1$ and $n = \pm 2$.
• $G = E(3) \ltimes \mathbb{R}^4$, the Galilei group, defined via the following $E(3)$-action on $\mathbb{R}^4$:

$$(R,\mathbf{v}): (a_0,\mathbf{a}) \mapsto (a_0, R\mathbf{a} + a_0\mathbf{v}). \tag{7.168}$$

Note that $\mathbf{v}$ is physically interpreted as a velocity, whereas earlier $\mathbf{a} \in \mathbb{R}^3 \subset E(3)$
was a position variable. This is clear from the defining $G$-action on $\mathbb{R}^4$ given by

$$(R,\mathbf{v},a_0,\mathbf{a}): (t,\mathbf{x}) \mapsto (t + a_0, R\mathbf{x} + \mathbf{a} + t\mathbf{v}), \tag{7.169}$$

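which in fact determines the action (7.168). Either way, we obtain the group law

$$(R,\mathbf{v},a_0,\mathbf{a})\cdot(R',\mathbf{v}',a_0',\mathbf{a}') = (RR',\, \mathbf{v} + R\mathbf{v}',\, a_0 + a_0',\, \mathbf{a} + R\mathbf{a}' + a_0'\mathbf{v}). \tag{7.170}$$

We therefore see that the role of the Lorentz group $SO(3,1)$ is now played by the
Euclidean group $E(3)$. Since from (7.170) the inverse is found to be

$$(R,\mathbf{v},a_0,\mathbf{a})^{-1} = (R^{-1},\, -R^{-1}\mathbf{v},\, -a_0,\, -R^{-1}(\mathbf{a} - a_0\mathbf{v})), \tag{7.171}$$

the dual $E(3)$-action on $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ is given (in non-relativistic notation) by

$$(R,\mathbf{v}): (E,\mathbf{p}) \mapsto (E - \langle\mathbf{v},R\mathbf{p}\rangle,\, R\mathbf{p}). \tag{7.172}$$

Hence the dual $E(3)$-orbits in $\mathbb{R}^4$ are labeled by $E \in \mathbb{R}$ and $r > 0$, as follows:

$$\mathcal{O}_E = \{(E,0)\}; \tag{7.173}$$

$$\mathcal{O}_r = \{(E,\mathbf{p}),\ E \in \mathbb{R},\ \|\mathbf{p}\| = r\}. \tag{7.174}$$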


The representations of $G$ corresponding to the first type are basically the representations
of $E(3)$, whereas in the second case the stability group of say $(0,0,0,r)$
is isomorphic to $E(2)$. None of the ensuing induced representations of $G$ reproduces
some recognizable version of non-relativistic quantum mechanics, for
which we need to pass to projective representations of $G$. These may be found
from Theorem 5.62, which here applies in full glory, since $H^2(\mathfrak{g},\mathbb{R}) \neq 0$. A
(lengthy) computation shows that $H^2(\mathfrak{g},\mathbb{R})$ has a single generator

$$\varphi((M,\mathbf{v},a_0,\mathbf{a}),(M',\mathbf{v}',a_0',\mathbf{a}')) = \langle\mathbf{v},\mathbf{a}'\rangle - \langle\mathbf{v}',\mathbf{a}\rangle, \tag{7.175}$$

where $M \in \mathfrak{so}(3)$, and $(\mathbf{v},a_0,\mathbf{a}) \in \mathbb{R}^3 \times \mathbb{R}^4 \subset \mathfrak{g} = \mathfrak{so}(3) \oplus \mathbb{R}^3 \oplus \mathbb{R}^4$ are identified
with the corresponding Lie group elements. Following the procedure culminating
in Theorem 5.62, the central extension $\check{G}$ is found to be (cf. (7.159) and (5.46))

$$\check{G} = \tilde{E}(3) \ltimes \mathbb{R}^5, \tag{7.176}$$

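where, writing $\tilde\pi(u) = R(u)$, the covering group $\tilde{E}(3)$ acts on $\mathbb{R}^5$ through

$$(u,\mathbf{v}): (a_0,\mathbf{a},c) \mapsto \big(a_0,\, R(u)\mathbf{a} + a_0\mathbf{v},\, c + \tfrac12 a_0\|\mathbf{v}\|^2 + \langle\mathbf{v},R(u)\mathbf{a}\rangle\big). \tag{7.177}$$

Consequently, writing $x = (R,\mathbf{v},a_0,\mathbf{a})$, for the group law in $\check{G}$ we obtain

$$(x,c)\cdot(x',c') = \big(x\cdot x',\, c + c' + \langle\mathbf{v},R(u)\mathbf{a}'\rangle + \tfrac12 a_0'\|\mathbf{v}\|^2\big). \tag{7.178}$$

Eq. (7.177) implies the following dual $\tilde{E}(3)$-action on $(\mathbb{R}^5)^* \cong \mathbb{R}^5$:

$$(u,\mathbf{v}): (E,\mathbf{p},m) \mapsto \big(E - \langle\mathbf{v},R(u)\mathbf{p}\rangle + \tfrac12 m\|\mathbf{v}\|^2,\, R(u)\mathbf{p} - m\mathbf{v},\, m\big). \tag{7.179}$$

This time, the $\tilde{E}(3)$-orbits in $\mathbb{R}^5$ are:

1. $\mathcal{O}_E = \{(E,0,0)\}$ ($E \in \mathbb{R}$), with stabilizer $\tilde{E}(3)$;

2. $\mathcal{O}_{(r,0)} = \{(E,\mathbf{p},0) \mid E \in \mathbb{R}, \|\mathbf{p}\| = r\}$ ($r > 0$), with stabilizer $E(2)'$;

3. $\mathcal{O}_{U,m} = \{(E,\mathbf{p},m) \mid E - E_{\mathbf{p}} = U\}$ ($m \in \mathbb{R}\setminus\{0\}$, $U \in \mathbb{R}$), with stabilizer $SU(2)$.

Here $E(2)' \subset \tilde{E}(3)$ is a double cover of $E(2)$, like the subgroup of $SL(2,\mathbb{C})$
stabilizing the point $(1,0,0,1) \in \mathbb{R}^4$ in the theory of the Poincaré group. This
time we take any point $(E,0,0,r,0) \in \mathbb{R}^5$, which is stabilized by pairs $(u,\mathbf{v}) \in \tilde{E}(3)$
for which $R(u)$ is a rotation around the $z$-axis and $\mathbf{v} = (v_1,v_2,0)$; the image
of these pairs in $E(3)$ is $E(2) = SO(2) \ltimes \mathbb{R}^2$, where $SO(2) \subset SO(3)$ consists of
rotations around the $z$-axis and $\mathbb{R}^2$ is the $x$-$y$ plane. In the third case we write
$E_{\mathbf{p}} = \|\mathbf{p}\|^2/2m$ and take $(U,0,m)$, whose stabilizer in $E(3)$ is evidently $SO(3)$.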
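
The Galilei formulae above lend themselves to a quick numerical sanity check. The following is a minimal sketch in plain Python, not part of the original text: it encodes the group law (7.170), the inverse (7.171), the space-time action (7.169) and the dual action (7.179), and defines the combination $E - \|\mathbf{p}\|^2/2m$ that labels the orbits $\mathcal{O}_{U,m}$. All function names (`mul`, `inv`, `act`, `dual_act`, `internal_energy`) and the sample group elements are illustrative assumptions, and $R(u)$ is represented directly by a rotation matrix.

```python
import math

def matvec(R, x):
    return tuple(sum(R[i][j] * x[j] for j in range(3)) for i in range(3))

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def transpose(R):
    return tuple(tuple(R[j][i] for j in range(3)) for i in range(3))

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def scale(c, x):
    return tuple(c * a for a in x)

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def rot_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))

def rot_x(beta):
    c, s = math.cos(beta), math.sin(beta)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))

def mul(g, h):
    """Group law (7.170) for elements (R, v, a0, a)."""
    R, v, a0, a = g
    R2, v2, b0, b = h
    return (matmul(R, R2), add(v, matvec(R, v2)), a0 + b0,
            add(add(a, matvec(R, b)), scale(b0, v)))

def inv(g):
    """Inverse (7.171); the transpose inverts a rotation matrix."""
    R, v, a0, a = g
    Rt = transpose(R)
    return (Rt, scale(-1.0, matvec(Rt, v)), -a0,
            scale(-1.0, matvec(Rt, add(a, scale(-a0, v)))))

def act(g, event):
    """Defining action (7.169) on a space-time point (t, x)."""
    R, v, a0, a = g
    t, x = event
    return (t + a0, add(add(matvec(R, x), a), scale(t, v)))

def dual_act(R, v, point):
    """Dual action (7.179) on (E, p, m), with R standing for R(u)."""
    E, p, m = point
    Rp = matvec(R, p)
    return (E - dot(v, Rp) + 0.5 * m * dot(v, v), add(Rp, scale(-m, v)), m)

def internal_energy(point):
    """E - |p|^2 / 2m, the quantity U labelling the orbits (m nonzero)."""
    E, p, m = point
    return E - dot(p, p) / (2.0 * m)

def flat(obj):
    if isinstance(obj, (int, float)):
        return [float(obj)]
    return [x for part in obj for x in flat(part)]

def close(a, b, tol=1e-9):
    fa, fb = flat(a), flat(b)
    return len(fa) == len(fb) and all(abs(x - y) < tol for x, y in zip(fa, fb))

# Arbitrary sample data, chosen only for illustration.
IDENTITY = (rot_z(0.0), (0.0, 0.0, 0.0), 0.0, (0.0, 0.0, 0.0))
g = (rot_z(0.7), (0.3, -1.2, 0.5), 2.0, (1.0, 0.0, -4.0))
h = (matmul(rot_x(1.1), rot_z(-0.4)), (-0.8, 0.1, 0.9), -1.5, (0.2, 3.0, 1.0))
event = (0.25, (1.0, 2.0, 3.0))
point = (1.7, (0.4, -0.6, 2.0), 1.3)
```

With these definitions, `close(mul(g, inv(g)), IDENTITY)` checks (7.171) against (7.170), `close(act(mul(g, h), event), act(g, act(h, event)))` checks that (7.170) is the composition law of the action (7.169), and comparing `internal_energy(dual_act(g[0], g[1], point))` with `internal_energy(point)` checks that the dual action preserves the orbit label $U$.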

Thus we have massless as well as massive particles both in relativistic and in non-relativistic
quantum physics. The simplest case of all is formed by massive non-relativistic
particles, which correspond to the orbits $\mathcal{O}_{U,m}$ above, supplemented with
a spin $j$ labelling the underlying irreducible representation $D_j$ of $SU(2)$. Such orbits
are diffeomorphic to $\mathbb{R}^3$ under the identification $(U + E_{\mathbf{p}},\mathbf{p},m) \leftrightarrow \mathbf{p}$, and a convenient
choice of the cross-section $s: \mathcal{O}_{U,m} \to \tilde{E}(3)$ is $s(\mathbf{p}) = (1_2, -\mathbf{p}/m)$, since in
that case the Wigner cocycle simply becomes $s(\mathbf{p})^{-1}(u,\mathbf{v})s((u,\mathbf{v})^{-1}\mathbf{p}) = u$. Since
different values of $U$ turn out to give equivalent representations of $G$ (in the sense
explained at the end of §5.10), we take $U = 0$, and eqs. (7.135) - (7.136) become

$$H^{m,j} = L^2(\mathbb{R}^3) \otimes \mathbb{C}^{2j+1}; \tag{7.180}$$

$$u^{m,j}(u,\mathbf{v},a_0,\mathbf{a})\psi(\mathbf{p}) = e^{i(a_0E_{\mathbf{p}} + \langle\mathbf{a},\mathbf{p}\rangle)}D_j(u)\psi(R(u)^{-1}(\mathbf{p} + m\mathbf{v})). \tag{7.181}$$

Here $L^2(\mathbb{R}^3)$ simply carries Lebesgue measure $d^3p$, which is $\tilde{E}(3)$-invariant.

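The massive relativistic case is slightly more involved: we again have $\mathcal{O}_m^+ \cong \mathbb{R}^3$
under $(\omega_{\mathbf{p}},\mathbf{p}) \leftrightarrow \mathbf{p}$, where $\omega_{\mathbf{p}} = \sqrt{\|\mathbf{p}\|^2 + m^2}$, but the Lorentz-invariant measure on
$\mathcal{O}_m^+$ is $d^3p/\omega_{\mathbf{p}}$. For each $\mathbf{p} \in \mathbb{R}^3$ there is a unique boost $b_{\mathbf{p}} \in L_+^\uparrow$ that maps $(m,0,0,0)$
to $(\omega_{\mathbf{p}},\mathbf{p})$, with pre-image $\tilde{b}_{\mathbf{p}}$ in $SL(2,\mathbb{C})$, so we take $s(\mathbf{p}) = \tilde{b}_{\mathbf{p}}$. The Hilbert space
is (mutatis mutandis) still given by (7.180), but instead of (7.181) we now obtain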

M'«’^(X),a)v/(p) = (7-182) 

where $a = (a_0,\mathbf{a})$, $\tilde\lambda \in SL(2,\mathbb{C})$, and $\lambda \in L_+^\uparrow$ is the image of $\tilde\lambda$ under the covering
projection. We leave the corresponding formulae for the massless case to the reader.

7.7 Quantization and permutation symmetry

Another interesting application of the quantization theory developed in this chapter
is to indistinguishable particles. Since all elementary particles come in families
of indistinguishable sorts (such as electrons, photons, ...), this topic is obviously
of fundamental importance to physics. It is also puzzling, since (as we shall see)
mathematically one expects more possibilities than those realized in Nature (namely
bosons and fermions). This topic is also interesting philosophically, because it appears
to be a testing ground for Leibniz’s Principle of the Identity of Indiscernibles
(PII), which states that two different objects cannot have exactly the same properties
(in other words, two objects that have exactly the same properties must be identical).

After a period of confusion but growing insight, involving some of the greatest
physicists such as Planck, Einstein, Ehrenfest, Fermi, and especially Heisenberg,
the modern point of view on quantum statistics was introduced by Dirac.

Using modern notation, and abstracting from his specific example (which involved
electronic wave-functions), Dirac’s argument is as follows. Let $H$ be the
Hilbert space of a single quantum system, called a particle in what follows. The
two-fold tensor product $H^2 \equiv H \otimes H$ then describes two distinguishable copies of
this particle. The permutation group $\mathfrak{S}_2$ on two objects, with nontrivial element
$(12)$, acts on the state space $H^2$ by linear extension of $u(12)\psi_1\otimes\psi_2 = \psi_2\otimes\psi_1$.
Praising Heisenberg’s emphasis on defining everything in terms of observable
quantities only, Dirac then declares the two particles to be indistinguishable if
$u(12)au(12)^* = a$ for any two-particle observable $a$; by unitarity, this is to say that
$a$ commutes with $u(12)$. Dirac notes that such operators map symmetrized vectors
(i.e. those $\psi \in H\otimes H$ for which $u(12)\psi = \psi$) into symmetrized vectors, and likewise
map anti-symmetrized vectors (i.e. those $\psi \in H\otimes H$ for which $u(12)\psi = -\psi$)
into anti-symmetrized vectors, and these are the only possibilities; we would now
say that under the action of the $\mathfrak{S}_2$-invariant (bounded) operators one has

$$H^2 \cong H^2_S \oplus H^2_A; \tag{7.183}$$

$$H^2_S = \{\psi \in H^2 \mid u(12)\psi = \psi\}; \tag{7.184}$$

$$H^2_A = \{\psi \in H^2 \mid u(12)\psi = -\psi\}. \tag{7.185}$$

Arguing that in order to avoid double counting (in that $\psi$ and $u(12)\psi$ should not
both occur as independent states) one has to pick one of these two possibilities, Dirac
concludes that state vectors of a system of two indistinguishable particles must be
either symmetric or anti-symmetric. He then generalizes this to $N$ identical particles:
if $(ij)$ is the element of the permutation group $\mathfrak{S}_N$ on $N$ objects that permutes $i$ and
$j$ ($i,j = 1,\ldots,N$), then according to Dirac, $\psi \in H^N \equiv H^{\otimes N}$ should satisfy either
$u(ij)\psi = \psi$, in which case $\psi \in H^N_S$, or $u(ij)\psi = -\psi$, in which case $\psi \in H^N_A$, where
$u$ is the natural unitary representation of $\mathfrak{S}_N$ on $H^N$, given, on $p \in \mathfrak{S}_N$, by linear
(and if necessary continuous) extension of

$$u(p)\psi_1\otimes\cdots\otimes\psi_N = \psi_{p(1)}\otimes\cdots\otimes\psi_{p(N)}. \tag{7.186}$$

Equivalently, $\psi \in H^N_S$ if it is invariant under all permutations, and $\psi \in H^N_A$ if it is
invariant under even permutations and picks up a minus sign under odd permutations.
A slightly more sophisticated version of this argument, which one often finds, runs as follows:

‘Since, in the case of indistinguishable particles, $\psi \in H^N$ and $u(p)\psi$ must represent the
same state for any $p \in \mathfrak{S}_N$, and since two unit vectors represent the same state iff they
differ by a phase factor, by unitarity it must be that $u(p)\psi = c(p)\psi$, for some $c(p) \in \mathbb{C}$
satisfying $|c(p)| = 1$. The group property $u(pp') = u(p)u(p')$ then implies that $c(p) = 1$ for
even permutations and $c(p) = \pm 1$ for odd permutations. The choice $+1$ in the latter leads
to bosons, whereas $-1$ leads to fermions, so these are the only possibilities.’

Alas, where Dirac’s argument is incomplete, this one is even inconsistent: the claim
that two unit vectors represent the same state iff they differ by a phase factor
presumes that the particles are distinguishable! Indeed, the only physical argument to
the effect that two unit vectors $\psi$ and $\psi'$ are equivalent iff $\psi' = z\psi$ with $|z| = 1$, is
that it guarantees that expectation values coincide, i.e., that

$$\langle\psi,a\psi\rangle = \langle\psi',a\psi'\rangle, \tag{7.187}$$

for all (bounded) operators $a$, i.e., not merely for the permutation-invariant operators
(in which case (7.187) does not follow). But, following Heisenberg and Dirac, the
whole point of having indistinguishable particles is that an operator $a$ represents a
physical observable iff it is invariant under all permutations (acting by conjugation)!
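
The gap in the phase-factor argument can be seen in the smallest possible example. The sketch below (plain Python, with $H = \mathbb{R}^2$ standing in for a two-dimensional Hilbert space; all names and the sample vectors are illustrative assumptions, not taken from the text) compares $\psi = \psi_1\otimes\psi_2$ with $u(12)\psi = \psi_2\otimes\psi_1$: the two vectors give the same expectation value for a permutation-invariant operator, yet they do not differ by a phase, and a non-invariant operator tells them apart.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(B)
    n = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def tensor(x, y):
    """Components of x (tensor) y in the basis e_i (tensor) e_j, index 2*i + j."""
    return [a * b for a in x for b in y]

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))  # real vectors only

def expect(psi, a):
    return inner(psi, matvec(a, psi))

ONE = [[1.0, 0.0], [0.0, 1.0]]
B1 = [[1.0, 0.0], [0.0, -1.0]]           # a one-particle observable
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0],      # u(12): e_i (x) e_j -> e_j (x) e_i
        [0, 1, 0, 0], [0, 0, 0, 1]]

psi1, psi2 = [1.0, 0.0], [0.6, 0.8]      # two unit vectors in H
psi = tensor(psi1, psi2)
swapped = matvec(SWAP, psi)              # equals tensor(psi2, psi1)

invariant = mat_add(kron(B1, ONE), kron(ONE, B1))  # commutes with SWAP
one_sided = kron(B1, ONE)                          # does not commute with SWAP
```

Here `expect(psi, invariant)` and `expect(swapped, invariant)` agree, whereas `expect(psi, one_sided)` and `expect(swapped, one_sided)` differ, and `inner(psi, swapped)` has modulus strictly less than one, so the two unit vectors are not related by a phase.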

Although the above arguments therefore seem feeble at best, their conclusion that
only bosons and fermions can exist seems validated by Nature, despite the mathematical
fact that the orthogonal complement of $H^N_S \oplus H^N_A$ in $H^N$ (describing particles
with parastatistics) is non-zero as soon as $N > 2$. This should be a source of concern,
and indeed, much research on indistinguishable particles (in $d > 2$) has had
the goal of explaining away parastatistics. Distinguished by the different actions of
$\mathfrak{S}_N$ they depart from, these explanations have traditionally been based on:

• Quantum observables. $\mathfrak{S}_N$ acts on the C*-algebra $B(H^N)$ of bounded operators
on $H^N$ by conjugation of the unitary representation $u(\mathfrak{S}_N)$ on $H^N$, cf. (7.186).
One implements permutation invariance by postulating that the physical observables
of the $N$-particle system under consideration be the $\mathfrak{S}_N$-invariant operators:
with $u$ given by (7.186), the algebra of observables is therefore taken to be

$$M_N = B(H^N)^{\mathfrak{S}_N} = \{a \in B(H^N) \mid [a,u(p)] = 0\ (p \in \mathfrak{S}_N)\}. \tag{7.188}$$

• Quantum states. By restriction, $\mathfrak{S}_N$ then also acts on the (normal) state space

$$\mathcal{S}_n(H^N) = \mathcal{D}(H^N) \subset B(H^N), \tag{7.189}$$

from which it is postulated that the physical state space is $\mathcal{D}(H^N)^{\mathfrak{S}_N}$.

• Classical states. $\mathfrak{S}_N$ acts on $M^N$, the $N$-fold cartesian product of the classical
one-particle phase space $M$, by permutation. If $M = T^*Q$ for some configuration
space $Q$, we might as well start from the natural action of $\mathfrak{S}_N$ on $Q^N$ (pulled back
to $M^N$), and this is indeed what we shall do, often further simplifying to $Q = \mathbb{R}^d$.

Unsurprisingly, the first two approaches are equivalent. Define a linear map

$$E_N: B(H^N) \to B(H^N)^{\mathfrak{S}_N}; \tag{7.190}$$

$$a \mapsto \frac{1}{N!}\sum_{p\in\mathfrak{S}_N} u(p)au(p)^*; \tag{7.191}$$

this is a (normal) conditional expectation from the von Neumann algebra $B(H^N)$
to the von Neumann algebra $B(H^N)^{\mathfrak{S}_N}$, i.e., $E_N(a^*) = E_N(a)^*$ for all $a \in B(H^N)$,
$E_N^2 = E_N$, and $\|E_N\| = 1$. Moreover, $E_N$ preserves positivity as well as the trace, so
that it also maps the state space $\mathcal{D}(H^N)$ onto the invariant states $\mathcal{D}(H^N)^{\mathfrak{S}_N} \subset B(H^N)^{\mathfrak{S}_N}$.

Simple computations also establish the properties

$$\mathrm{Tr}(\rho a) = \mathrm{Tr}(E_N(\rho)a) \quad (\rho \in \mathcal{D}(H^N),\ a \in B(H^N)^{\mathfrak{S}_N}); \tag{7.192}$$

$$\mathrm{Tr}(\rho a) = \mathrm{Tr}(\rho E_N(a)) \quad (\rho \in \mathcal{D}(H^N)^{\mathfrak{S}_N},\ a \in B(H^N)). \tag{7.193}$$
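
For a finite-dimensional one-particle space these properties can be checked by direct computation. The following is a minimal sketch in plain Python with exact rational arithmetic, assuming $H = \mathbb{C}^d$ with real test matrices; the helper names (`perm_matrix`, `cond_exp`) and the sample matrices are illustrative assumptions. The permutation operators follow (7.186) with 0-based indices, and since they are real permutation matrices their adjoint is the transpose.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def perm_matrix(p, d, N):
    """Matrix of u(p) from (7.186) on (C^d)^{(x)N}, with p a 0-based tuple."""
    basis = list(product(range(d), repeat=N))
    index = {b: k for k, b in enumerate(basis)}
    M = [[0] * len(basis) for _ in basis]
    for b in basis:
        image = tuple(b[p[k]] for k in range(N))
        M[index[image]][index[b]] = 1
    return M

def cond_exp(a, d, N):
    """E_N(a) = (1/N!) * sum over p of u(p) a u(p)^*, cf. (7.191)."""
    n = len(a)
    total = [[Fraction(0)] * n for _ in range(n)]
    for p in permutations(range(N)):
        U = perm_matrix(p, d, N)
        term = matmul(matmul(U, a), transpose(U))
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return [[Fraction(x) / factorial(N) for x in row] for row in total]

d, N = 2, 3
n = d ** N
# A matrix unit that is not permutation invariant, and an arbitrary test matrix.
a = [[Fraction(int(i == 0 and j == 1)) for j in range(n)] for i in range(n)]
rho = [[Fraction((3 * i + 5 * j + i * j) % 7) for j in range(n)] for i in range(n)]
Ea = cond_exp(a, d, N)
```

In this sketch `cond_exp(Ea, d, N) == Ea` expresses $E_N^2 = E_N$, `trace(cond_exp(rho, d, N)) == trace(rho)` expresses trace preservation, and `trace(matmul(rho, Ea)) == trace(matmul(cond_exp(rho, d, N), Ea))` is (7.192) for the invariant operator `Ea`. The identity is linear in `rho`, so positivity and normalization of `rho` play no role in the check.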

Finally, the reduction of $H^N$ under $u(\mathfrak{S}_N)$ described below may equally well be described
in terms of the state space, since a subspace $eH^N \subset H^N$ (where $e \in B(H^N)$ is
a projection) is stable under $u$ iff $e \in B(H^N)^{\mathfrak{S}_N}$, in which case it may be described
in terms of the associated density operator $\rho = e/\mathrm{Tr}(e) \in \mathcal{D}(H^N)^{\mathfrak{S}_N}$. With some
more effort, it can even be shown that $\rho \in \partial_e\mathcal{D}(H^N)^{\mathfrak{S}_N}$ iff $eH^N$ is irreducible.

We may therefore focus on the first and the third approaches, starting with the
first, based on (7.188). Note that the C*-algebra of invariant compact operators, i.e.,

$$A_N = B_0(H^N)^{\mathfrak{S}_N} = \{a \in B_0(H^N) \mid [a,u(p)] = 0\ (p \in \mathfrak{S}_N)\}, \tag{7.194}$$

induces the same decomposition of $H^N$ as $M_N$ does (since $M_N = A_N''$), so if $H$ is
infinite-dimensional one may use $A_N$ rather than $M_N$ as the algebra of quantum observables;
this is convenient for comparison with the classical state space approach.
As long as $\dim(H) > 1$ and $N > 1$, the algebras $M_N$ and $A_N$ act reducibly on
$H^N$. The reduction of $H^N$ under $M_N$ (and hence of $A_N$ and of $u(\mathfrak{S}_N)$) is traditionally
carried out by Schur duality. This rests on the following concepts.

Definition 7.11. • A partition $\lambda$ of $N$ is a way of writing

$$N = n_1 + \cdots + n_k, \quad n_1 \geq \cdots \geq n_k > 0, \quad k = 1,\ldots,N. \tag{7.195}$$

• The corresponding frame (or Young diagram) $F_\lambda$ is a picture of $N$ boxes with
$n_i$ boxes in the $i$’th row, $i = 1,\ldots,k$.

• For each frame $F_\lambda$, one has $N!$ possible Young tableaux $T$, each of which is a
particular way of writing all of the numbers 1 to $N$ into the boxes of $F_\lambda$.

• A Young tableau is standard if the entries in each row increase from left to
right and the entries in each column increase from top to bottom. The set of
all (standard) Young tableaux on $F_\lambda$ is called $\mathcal{T}_\lambda$ ($\mathcal{T}^S_\lambda$).

• To each $T \in \mathcal{T}_\lambda$ we associate the subgroup $\mathrm{Row}(T) \subset \mathfrak{S}_N$ of all permutations
$p \in \mathfrak{S}_N$ that preserve each row (i.e., each row of $T$ is permuted within itself);
likewise $\mathrm{Col}(T) \subset \mathfrak{S}_N$ consists of all $p \in \mathfrak{S}_N$ that preserve each column.

The set $\mathrm{Par}(N)$ of all partitions $\lambda$ of $N$ parametrizes the conjugacy classes of $\mathfrak{S}_N$
and hence also the (unitary) dual of $\mathfrak{S}_N$; in other words, up to (unitary) equivalence
each (unitary) irreducible representation $u_\lambda$ of $\mathfrak{S}_N$ bijectively corresponds to some
partition $\lambda$ of $N$; the dimension of any vector space $V_\lambda$ carrying $u_\lambda$ is $N_\lambda = |\mathcal{T}^S_\lambda|$,
that is, the number of different standard Young tableaux on the frame $F_\lambda$.
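
The combinatorics of Definition 7.11 are easy to explore for small $N$. The sketch below (plain Python, with illustrative function names) enumerates the partitions of $N$, counts standard Young tableaux both by brute force from the definition and by the hook length formula, and can be used to confirm that the dimensions satisfy $\sum_\lambda N_\lambda^2 = N!$. The hook length formula and this sum rule are standard facts about the symmetric group that are brought in here as assumptions; they are not derived in the text above.

```python
from itertools import permutations
from math import factorial

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples, cf. (7.195)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def is_standard(shape, filling):
    """Check whether a row-by-row filling of the frame is a standard tableau."""
    rows, k = [], 0
    for length in shape:
        rows.append(filling[k:k + length])
        k += length
    for row in rows:
        if any(row[j] > row[j + 1] for j in range(len(row) - 1)):
            return False
    for i in range(len(rows) - 1):
        if any(rows[i][j] > rows[i + 1][j] for j in range(len(rows[i + 1]))):
            return False
    return True

def brute_count(shape):
    """Number of standard tableaux, by testing all N! tableaux on the frame."""
    N = sum(shape)
    return sum(is_standard(shape, f) for f in permutations(range(1, N + 1)))

def hook_count(shape):
    """Number of standard tableaux via the hook length formula."""
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for lower in shape[i + 1:] if lower > j)
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks
```

For example, `list(partitions(3))` gives the three frames for $N = 3$, and `hook_count((2, 1))` gives the dimension of the representation attached to the partition $3 = 2 + 1$.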

Returning to (7.186), to each $\lambda \in \mathrm{Par}(N)$ and each Young tableau $T \in \mathcal{T}_\lambda$ we
associate an operator $e_T$ on $H^N$ by the formula

$$e_T = \frac{N_\lambda}{N!}\sum_{p\in\mathrm{Col}(T)}\mathrm{sgn}(p)u(p)\sum_{p'\in\mathrm{Row}(T)}u(p'), \tag{7.196}$$

which happens to be a projection. Its image $e_TH^N \subset H^N$ is denoted by $H^N_T$, and the
restriction of $M_N$ to $H^N_T$ is called $M_N(T)$. One may now write the decomposition of
$H^N$ under the action of $M_N$ (up to unitary equivalence) as

$$H^N \cong \bigoplus_{\lambda\in\mathrm{Par}(N)} H^N_{T_\lambda}\otimes V_\lambda; \tag{7.197}$$

$$M_N \cong \bigoplus_{\lambda\in\mathrm{Par}(N)} M_N(T_\lambda)\otimes 1_{V_\lambda}; \tag{7.198}$$

$$u(\mathfrak{S}_N) \cong \bigoplus_{\lambda\in\mathrm{Par}(N)} 1_{H^N_{T_\lambda}}\otimes u_\lambda, \tag{7.199}$$

where the labeling is by the partitions $\lambda$ of $N$, the multiplicity spaces $V_\lambda$ are irreducible
$\mathfrak{S}_N$-modules, and $T_\lambda$ is an arbitrary choice of a Young tableau defined
on $F_\lambda$. For simplicity we here assume that $\dim(H) \geq N$; if $\dim(H) < N$, then only
partitions (7.195) with $k \leq \dim(H)$ occur. For example, the partitions (7.195) of
$N = 2$ are $2 = 2$ and $2 = 1 + 1$, each of which admits only one standard Young
tableau, which we denote by $S$ and $A$, respectively. With $N_S = N_A = 1$ and hence
$V_S = V_A = \mathbb{C}$ as vector spaces, this recovers (7.183); the corresponding projections
$e_+$ and $e_-$, respectively, are given by $e_+ = \tfrac12(1 + u(12))$ and $e_- = \tfrac12(1 - u(12))$. The
bosonic states $\psi_+$, i.e., the solutions of $\psi_+ \in H^2_S$, or $e_+\psi_+ = \psi_+$, are just the symmetric
vectors, whereas the fermionic states $\psi_- \in H^2_A$ are the antisymmetric ones.
These sectors exist for all $N > 1$ and they always occur with multiplicity one.

However, and this is the bite of the topic, for $N \geq 3$ additional irreducible representations
of $M_N$ appear, always with multiplicity greater than one; states in such
sectors are said to describe paraparticles and/or are said to have parastatistics. For
example, for $N = 3$ one new partition $3 = 2 + 1$ occurs, with $N_P = 2$, and hence

$$H^3 = H^3_S \oplus H^3_A \oplus H^3_P \oplus H^3_{P'}, \tag{7.200}$$

where $H^3_P$ and $H^3_{P'}$ are the images of the projections $e_P = \tfrac13(1 - u(13))(1 + u(12))$
and $e_{P'} = \tfrac13(1 - u(12))(1 + u(13))$, respectively. The corresponding two classes of
parastates (i.e. states carrying parastatistics) $\psi_P$ and $\psi_{P'}$ then by definition satisfy
$e_P\psi_P = \psi_P$ and $e_{P'}\psi_{P'} = \psi_{P'}$, respectively. In other words, the Hilbert spaces carrying
each of the four sectors are the following closed linear spans:

$$H^3_S = \mathrm{span}^-\{\psi_{123} + \psi_{213} + \psi_{321} + \psi_{312} + \psi_{132} + \psi_{231}\}; \tag{7.201}$$

$$H^3_A = \mathrm{span}^-\{\psi_{123} - \psi_{213} - \psi_{321} + \psi_{312} - \psi_{132} + \psi_{231}\}; \tag{7.202}$$

$$H^3_P = \mathrm{span}^-\{\psi_{123} + \psi_{213} - \psi_{321} - \psi_{312}\}; \tag{7.203}$$

$$H^3_{P'} = \mathrm{span}^-\{\psi_{123} + \psi_{321} - \psi_{213} - \psi_{231}\}, \tag{7.204}$$

where $\psi_{ijk} \equiv \psi_i\otimes\psi_j\otimes\psi_k$ and the $\psi_i$ vary over $H$ (and $\mathrm{span}^-$ is closed linear span).
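
The spanning vectors in (7.201) - (7.204) can be reproduced mechanically from the projections. The sketch below (plain Python, exact arithmetic) treats $\psi_{ijk}$ as a formal basis vector labeled by the tuple `(i, j, k)` and lets $u(p)$ permute tensor factors as in (7.186) with 0-based indices. The function names and the dictionary representation of vectors are illustrative assumptions, and the factor $\tfrac16$ in `e_S` and `e_A` is the normalization $N_\lambda/N!$ of (7.196) for $N = 3$.

```python
from fractions import Fraction
from itertools import permutations

def act(p, vec):
    """u(p) of (7.186): psi_{b1 b2 b3} -> psi_{b_p(1) b_p(2) b_p(3)}, p 0-based."""
    out = {}
    for labels, c in vec.items():
        image = tuple(labels[p[k]] for k in range(len(p)))
        out[image] = out.get(image, 0) + c
    return out

def combine(*terms):
    """Linear combination of (coefficient, vector) pairs, dropping zero entries."""
    out = {}
    for coeff, vec in terms:
        for labels, c in vec.items():
            out[labels] = out.get(labels, 0) + coeff * c
    return {k: v for k, v in out.items() if v != 0}

def sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

T12, T13 = (1, 0, 2), (2, 1, 0)   # the transpositions (12) and (13)

def e_S(vec):
    return combine(*[(Fraction(1, 6), act(p, vec)) for p in permutations(range(3))])

def e_A(vec):
    return combine(*[(Fraction(sign(p), 6), act(p, vec))
                     for p in permutations(range(3))])

def e_P(vec):
    sym = combine((1, vec), (1, act(T12, vec)))        # (1 + u(12)) vec
    return combine((Fraction(1, 3), sym), (Fraction(-1, 3), act(T13, sym)))

def e_Pprime(vec):
    sym = combine((1, vec), (1, act(T13, vec)))        # (1 + u(13)) vec
    return combine((Fraction(1, 3), sym), (Fraction(-1, 3), act(T12, sym)))

psi123 = {(1, 2, 3): Fraction(1)}
```

Applying `e_P` to `psi123` returns one third of the combination in (7.203), and `e_Pprime` likewise reproduces (7.204); applying either map twice gives the same result as applying it once, in line with the statement that (7.196) is idempotent.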

For any $N > 2$, let us note that instead of the decomposition (7.197) - (7.198),
which is defined up to unitary equivalence, one may alternatively decompose as

$$H^N = \bigoplus_{T\in\mathcal{T}^S_\lambda,\ \lambda\in\mathrm{Par}(N)} H^N_T; \tag{7.205}$$

$$M_N = \bigoplus_{T\in\mathcal{T}^S_\lambda,\ \lambda\in\mathrm{Par}(N)} M_N(T), \tag{7.206}$$

which has the advantage over (7.197) - (7.198) that the $H^N_T$ are subspaces of $H^N$.
The disadvantage is that $M_N(T)$ is unitarily equivalent to $M_N(T')$ iff $T$ and $T'$ both
lie in $\mathcal{T}_\lambda$ (i.e., for the same $\lambda$), so that unlike (7.197) - (7.198), the decomposition
(7.205) - (7.206) is non-unique (for example, Young tableaux different from
standard ones might have been chosen in the parametrization). The analogue of the
third line (7.199) in the earlier decomposition would therefore be a mess. Indeed,
although $u(\mathfrak{S}_N)$ maps each of the subspaces $H^N_S$ and $H^N_A$ into itself (the former is even
pointwise invariant under $\mathfrak{S}_N$, whereas elements of the latter at most pick up a minus
sign), this is no longer the case for parastatistics. For example, for $N = 3$ some permutations
map $H^3_P$ into $H^3_{P'}$, and vice versa. This is clear from (7.205) - (7.206): for
$\lambda = P$, one has $\dim(V_P) = 2$, and choosing a basis $(v_1,v_2)$ of $V_P$ one may identify
$H^3_P$ and $H^3_{P'}$ in (7.205) with (say) $H^3_P\otimes v_1$ and $H^3_P\otimes v_2$ in (7.197), respectively.
And analogously for $N > 3$, where $\dim(V_\lambda) > 1$ for all $\lambda \neq S,A$.

A (or perhaps the) competing approach to permutation invariance in quantum
mechanics starts from classical (rather than quantal) data. Let $Q$ be the classical
single-particle configuration space, e.g., $Q = \mathbb{R}^d$; to avoid irrelevant complications,
we assume that $Q$ is a connected and simply connected manifold. The associated
configuration space of $N$ identical but distinguishable particles is $Q^N$. Depending
on the assumption of (im)penetrability of the particles, we may define one of

$$\breve{Q}_N = Q^N/\mathfrak{S}_N; \tag{7.207}$$

$$Q_N = (Q^N\setminus\Delta_N)/\mathfrak{S}_N, \tag{7.208}$$

as the configuration space of $N$ indistinguishable particles, where $\Delta_N$ is the extended
diagonal in $Q^N$, i.e., the set of points $(q_1,\ldots,q_N) \in Q^N$ where $q_i = q_j$ for at least
one pair $(i,j)$, $i \neq j$ (so that for $Q = \mathbb{R}$ and $N = 2$ this is the usual diagonal in $\mathbb{R}^2$).
At first sight, these two choices should lead to exactly the same quantum theory,
based on the Hilbert space $L^2(\breve{Q}_N) = L^2(Q_N)$, since $\Delta_N$ is a subset of measure zero
for any measure used to define $L^2$ that is locally equivalent to Lebesgue measure.
However, the effect of $\Delta_N$ is noticeable as soon as one represents physical observables
as operators on $L^2$ through any serious quantization procedure, which should
be sensitive to both the topological and the smooth structure of the underlying configuration
space. In the case at hand, $Q_N$ is multiply connected as a topological
space, but as a manifold it is smooth and has no singularities. In contrast, $\breve{Q}_N$ is
simply connected as a topological space, but in the smooth setting it is a so-called
orbifold. This leads to interesting complications, but following tradition (i.e., in the
configuration space approach to indistinguishable particles) we continue with $Q_N$.

To quantize $Q_N$ we use the language of Lie groupoids and their C*-algebras, cf.
§§C.16-C.17. Let $Q$ be any (possibly) multiply connected manifold, with universal
covering space $\tilde{Q}$. In particular, the first homotopy group $\pi_1(Q)$ acts (say from the
right) on $\tilde{Q}$ in such a way that $Q \cong \tilde{Q}/\pi_1(Q)$. We denote the canonical projection by
$\tau: \tilde{Q} \to Q$. One may have the example $Q = \mathbb{T}$, $\tilde{Q} = \mathbb{R}$, $\pi_1(Q) = \mathbb{Z}$ in mind here.

As a variation on the pair groupoid $G = Q\times Q$, we now consider the Lie groupoid

$$G_Q = \tilde{Q}\times_{\pi_1(Q)}\tilde{Q}, \tag{7.209}$$

whose elements are equivalence classes $[\tilde{q}_1,\tilde{q}_2]$ in $\tilde{Q}\times\tilde{Q}$ under the equivalence relation
$\sim$ defined by $(\tilde{q}_1,\tilde{q}_2) \sim (\tilde{q}_1',\tilde{q}_2')$ iff $\tilde{q}_1' = \tilde{q}_1x$ and $\tilde{q}_2' = \tilde{q}_2x$ for some $x \in \pi_1(Q)$;

the source and target projections are s([^i,^ 2 ]) = '^{qi) and f([^i,^ 2 ]) = re¬ 
spectively, the inverse is [^ 1 ,^ 2 ] * = and multiplication is the obvious one 

borrowed from the pair groupoid $\tilde{Q}\times\tilde{Q}$ over $\tilde{Q}$ (which is well defined on $G_Q$). The
tangent groupoid $G_Q^T$ of $G_Q$ (cf. Proposition C.117) has the following fiber at $\hbar = 0$:

$$(G_Q^T)_0 = TQ, \tag{7.210}$$

to be contrasted with the corresponding fiber $G^T_0 = T\tilde{Q}$ of the pair groupoid on the
covering space $\tilde{Q}$. In particular, for our configuration space $Q = Q_N$ we have

$$G_{Q_N} = \tilde{Q}_N\times_{\pi_1(Q_N)}\tilde{Q}_N; \tag{7.211}$$

$$(G^T_{Q_N})_0 = TQ_N, \tag{7.212}$$

which gives the fibers of the corresponding continuous bundle of C*-algebras as

$$A_0 = C_0(T^*Q_N) \quad (\hbar = 0); \tag{7.213}$$

$$A_\hbar = C^*(G_{Q_N}) \quad (0 < \hbar \leq 1), \tag{7.214}$$

cf. §C.19. This gives a generalization of the fibers (7.17) - (7.18) for $Q = \mathbb{R}^n$, and
also now we have an example of Definition 7.1: the fibers (7.213) - (7.214) combine
to form a continuous bundle of C*-algebras with total C*-algebra $A = C^*(G^T_{Q_N})$,
yielding a deformation quantization of the Poisson manifold $T^*Q_N$ (i.e., the usual
phase space defined by the configuration space $Q_N$). We now define the inequivalent
quantizations of $Q_N$ as the inequivalent irreducible representations of the corresponding
C*-algebra of quantum observables $C^*(G_{Q_N})$, as follows.

Theorem 7.12. 1. Let $Q$ be multiply connected. The inequivalent irreducible representations
of the C*-algebra $C^*(G_Q)$ bijectively correspond to the inequivalent
irreducible unitary representations $u_\lambda$ of the first homotopy group $\pi_1(Q)$.

2. Each representation has a natural realization on the Hilbert space

$$H^\lambda = L^2(Q)\otimes H_\lambda, \tag{7.215}$$

where $H_\lambda$ is a specific carrier space for the representation $u_\lambda$. More fancifully,
one may use the Hilbert space $L^2(Q,E_\lambda)$ of $L^2$-sections of the vector bundle

$$E_\lambda = \tilde{Q}\times_{\pi_1(Q)}H_\lambda \tag{7.216}$$

associated to the principal bundle $\tau: \tilde{Q} \to Q$ by the representation $u_\lambda$.

Provided one accepts (7.208), this theorem in principle gives a complete solution
to the problem of quantizing multiply connected configuration spaces, and hence,
taking $Q = Q_N$, of the problem of quantizing systems of indistinguishable particles.

Proof We just prove Theorem 7.12 in the case we need, where $\pi_1(Q)$ is finite. Then

$$C^*(G_Q) \cong B_0(L^2(\tilde{Q}))^{\pi_1(Q)}; \tag{7.217}$$

$$B_0(L^2(\tilde{Q}))^{\pi_1(Q)} \cong B_0(L^2(Q))\otimes C^*(\pi_1(Q)), \tag{7.218}$$

where (in our usual notation) $B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$ is the C*-algebra of $\pi_1(Q)$-invariant
compact operators on $L^2(\tilde{Q})$, and $C^*(\pi_1(Q))$ is the group C*-algebra of $\pi_1(Q)$
(which is finite-dimensional and hence nuclear, given the assumption that $\pi_1(Q)$
is finite, so that the choice of the C*-algebraic tensor product does not matter).

To prove (7.217), we first exploit finiteness of Tti (Q) in order to identify functions 
a GCf (Gq) with constrained Cf functions a on Q x Q that satisfy 

a{qh,q'h) = a{q,q') {h&n\{Q)). (7.219) 

This identification is explicitly given by 

a{q,q')=a{\q,^]), (7.220) 

where [q,q'] denotes the equivalence class of {q,q') & Qx Q under the diagonal 
action of ni{Q). This makes the space Q°(Gg) a dense subset of C*{Gq). We write 
« G Cf{Q X for (7.208) this just means that a is a permutation-invariant 

kernel. Second, we equip $\tilde{Q}$ with some measure $d\tilde{q}$ that is locally equivalent to the 
Lebesgue measure, and in addition is $\pi_1(Q)$-invariant under the regular action $R$ of 
$\pi_1(Q)$ on functions on $\tilde{Q}$, given, as usual, by $R_h\tilde\psi(\tilde{q}) = \tilde\psi(\tilde{q}h)$. In that case, one also 
has a measure $dq$ on $Q$ that is locally equivalent to the Lebesgue measure, so that 
the measures $d\tilde{q}$ and $dq$ on $\tilde{Q}$ and $Q$, respectively, are related by 

$$\int_{\tilde{Q}} d\tilde{q}\, f(\tilde{q}) = \frac{1}{|\pi_1(Q)|} \sum_{h \in \pi_1(Q)} \int_Q dq\, f(s(q)h). \tag{7.221}$$

Here $f \in C_c(\tilde{Q})$, $|\pi_1(Q)|$ is the number of elements of $\pi_1(Q)$, and $s : Q \to \tilde{Q}$ is any 
(measurable) cross-section of $\tau : \tilde{Q} \to Q$. We may then define a Hilbert space $L^2(\tilde{Q})$ 
with respect to $d\tilde{q}$, on which elements $\tilde{a}$ of $C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$ act faithfully by 

$$\tilde{a}\tilde\psi(\tilde{q}) = \int_{\tilde{Q}} d\tilde{q}'\, \tilde{a}(\tilde{q}, \tilde{q}')\tilde\psi(\tilde{q}'). \tag{7.222}$$

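The product of two such operators is given by the multiplication of the kernels on 
$\tilde{Q}$, and involution is defined as expected, too, namely by hermitian conjugation: 

$$\tilde{a}^*(\tilde{q}, \tilde{q}') = \overline{\tilde{a}(\tilde{q}', \tilde{q})}. \tag{7.223}$$

The norm-closure of $C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$, represented as operators on $L^2(\tilde{Q})$ by (7.222), 
is then given by $B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$. This proves (7.217). 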

Eq. (7.218) is a special case of the following: let $X$ be a manifold carrying a 
free action of a compact group $G$. If $L^2(X)$ is defined by some $G$-invariant “locally 
Lebesgue” measure on $X$, as in the construction above, then one has an isomorphism 

$$B_0(L^2(X))^G \cong B_0(L^2(X/G)) \otimes C^*(G). \tag{7.224}$$

This is proved in a similar way, realizing $B_0(H)$ as the norm-completion of the 
Hilbert-Schmidt operators $B_2(H)$ (for general $H$), and, in the $L^2$-case at hand, 
identifying $B_2(L^2(X))$ with the algebra of operators with kernels in $L^2(X \times X)$. 

Part 2 of the theorem now follows from the fact that for any Hilbert space H the 
C*-algebra Bo{H) of compact operators on H has exactly one irreducible represen¬ 
tation (up to unitary equivalence), i.e. the defining one (this can be proved in many 
ways, e.g. from Rieffel’s theory of Morita equivalence of C*-algebras), combined 
with the bijective correspondence between continuous unitary representations u of 
any locally compact group G and non-degenerate representations of its associated 
group C*-algebra $C^*(G)$; see §C.18, Definition C.119 etc. □ 

As mentioned in Theorem 7.12, there are two ways of realizing the Hilbert space 
$H^\chi$, where $\chi$ labels some irreducible representation of $\pi_1(Q)$. This is very similar 
to the discussion in §7.5, so we will be relatively brief here. The first realization 
corresponds to having constrained wave-functions defined on the covering space $\tilde{Q}$; 
for example, the usual description of bosonic or fermionic wave-functions is of this 
sort. The second realization uses unconstrained wave-functions on the actual 
configuration space $Q$ (bad hombres confusingly call such functions “multi-valued”). 

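1. The space $C^\infty(Q,E_\chi)$ of smooth cross-sections of $E_\chi$ may be given by the 
smooth maps $\tilde\psi : \tilde{Q} \to H_\chi$ satisfying the equivariance condition (“constraint”) 

$$\tilde\psi(\tilde{q}h) = u_\chi(h^{-1})\tilde\psi(\tilde{q}), \tag{7.225}$$

for all $h \in \pi_1(Q)$, $\tilde{q} \in \tilde{Q}$. The Hilbert space 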

(7.226) 
then, is defined as the usual $L^2$-completion of the space of all $\tilde\psi \in \Gamma(Q,E_\chi)$ for 
which The irreducible representation 7t^ {C*{Gq)) is then given on 

elements a of the dense subspace C~(Gg) of C*{Gq) by the expression 

n^{a)^{q) = [ a{[q,q'])\lf{q')-, (7.227) 

JQ 

any ;ri (2)-invariant operator on L?{Q) acts on in this way (ignoring Hi). 

If $\pi_1(Q)$ is finite, then two simplifications occur. Firstly, $H_\chi$ is finite-dimensional, 
and secondly each Hilbert space $H^\chi$ may be regarded as a subspace of $L^2(\tilde{Q})$; the 
above action of $C^*(G_Q)$ on $H^\chi$ is then simply given by restriction of its action on 
$L^2(\tilde{Q})$. In that case one may equivalently realize this irreducible representation 
in terms of the right-hand side of (7.217), in which case the action of $\pi^\chi(\tilde{a})$ on $H^\chi$ 
as defined in (7.226) is given by 

$$\pi^\chi(\tilde{a})\tilde\psi(\tilde{q}) = \int_{\tilde{Q}} d\tilde{q}'\, \tilde{a}(\tilde{q}, \tilde{q}')\tilde\psi(\tilde{q}'). \tag{7.228}$$

This is true as it stands if $\tilde{a} \in C_c^\infty(\tilde{Q} \times \tilde{Q})^{\pi_1(Q)}$, cf. (7.219), and may be extended 
to general $\pi_1(Q)$-invariant compact operators $\tilde{a} \in B_0(L^2(\tilde{Q}))^{\pi_1(Q)}$ by norm 
continuity, and, furthermore, even to strong or weak continuity. 

2. Elements of the Hilbert space L^(2,//;L)^b2) ^re typically (equivalence classes 
of) discontinuous cross-sections of $E_\chi$. Possibly discontinuous cross-sections 
may simply be given directly as functions $\psi : Q \to H_\chi$, with inner product 

$$\langle\psi, \varphi\rangle = \int_Q dq\, \langle\psi(q), \varphi(q)\rangle_{H_\chi}. \tag{7.229}$$

This specific realization of L^{Q,E ^) will be denoted by (Q) = C, 

L^{Q)®Hi^L^{Q). (7.230) 

These equivalent descriptions of $H^\chi$ may be related once a (typically discontinuous) 
cross-section $\sigma : Q \to \tilde{Q}$ of the projection $\tau : \tilde{Q} \to Q$ has been chosen (i.e., 
$\tau \circ \sigma = \mathrm{id}_Q$), in which case $\psi(q) = \tilde\psi(\sigma(q))$. We formalize this in terms of a unitary 

(7.231) 

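$$u\tilde\psi(q) = \tilde\psi(\sigma(q)); \tag{7.232}$$

$$u^{-1}\psi(\tilde{q}) = u_\chi(h)\psi(q), \tag{7.233}$$

where $q = \tau(\tilde{q})$, and $h$ is the unique element of $\pi_1(Q)$ for which $\tilde{q}h = \sigma(q)$. The 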
action (a) = un^ {a)u^^ on now follows from (7.228) - (7.233): If a 

is a %\ (Q)-invariant kernel on L^{Q), then using (7.221) we obtain 

7t^{a)\j/{q)= Y, f dq'a{a{q),a{q')h)ui{h)\j/{q'). (7.234) 

heKiiQ)-'^ 
We now apply this formalism to $N$ indistinguishable particles moving on the 
(single-particle) configuration space $\mathbb{R}^3$. Eq. (7.208) then gives the $N$-particle space 

= (7.235) 

The universal covering space of this multiply connected space is 

= = (7.236) 

which (unlike its counterpart in $d = 2$) is connected and simply connected, so that 

$$\pi_1(Q_N) = \mathfrak{S}_N. \tag{7.237}$$

It follows from (7.217) and (7.237) that the algebra of observables is given by 

C*(Gqjv) (7.238) 

Comparing (7.238) with (7.194), we obtain a complete equivalence between the 
“quantum observables” approach and the deformation quantization approach based 
on Theorem 7.12, in that the configuration space approach through the 
representation theory of the groupoid C*-algebra $C^*(G_{Q_N})$ leads to the same classification as 
the “quantum observables” approach based on (7.188) above, cf. (7.197) - (7.199). 
We discuss a few interesting special cases. 

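N = 1. Here $Q_1 = \tilde{Q}_1 = \mathbb{R}^3$ and $\pi_1(Q_1) = \{e\}$, so the algebra of observables is 

$$C^*(G_{Q_1}) = B_0(L^2(\mathbb{R}^3)), \tag{7.239}$$

which has a unique irreducible representation on $L^2(\mathbb{R}^3)$. 

N = 2. This time, the pertinent homotopy group is 

$$\pi_1(Q_2) = \mathfrak{S}_2 = \mathbb{Z}_2 = \{e, (12)\}, \tag{7.240}$$

which has two irreducible representations: firstly, $u_B(p) = 1$ for both $p \in \mathfrak{S}_2$, 
and secondly, $u_F(e) = 1$, $u_F(12) = -1$, each realized on $H_\chi = \mathbb{C}$. Hence with 
$q = (x,y,z) \in \mathbb{R}^3$, eq. (7.225) yields 

$$H^B = \{\psi \in L^2(\mathbb{R}^3)^{\otimes 2} \mid \psi(q_2,q_1) = \psi(q_1,q_2)\}; \tag{7.241}$$

$$H^F = \{\psi \in L^2(\mathbb{R}^3)^{\otimes 2} \mid \psi(q_2,q_1) = -\psi(q_1,q_2)\}. \tag{7.242}$$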

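Here the C*-algebra 

$$C^*(G_{Q_2}) \cong B_0(L^2(\mathbb{R}^3 \times \mathbb{R}^3))^{\mathfrak{S}_2} \tag{7.243}$$

consists of all $\mathfrak{S}_2$-invariant compact operators on $L^2(\mathbb{R}^3 \times \mathbb{R}^3)$, acting on $H^B$ or 
$H^F$ in the same way as they do on $L^2(\mathbb{R}^3 \times \mathbb{R}^3)$, cf. (7.228), noting that the constraints 
in (7.241) and (7.242) are preserved due to the $\mathfrak{S}_2$-invariance of $a \in C^*(G_{Q_2})$. 
This recovers Dirac’s description of statistics given earlier in this section. 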
N = 3. Here we have a non-abelian homotopy group 

$$\pi_1(Q_3) = \mathfrak{S}_3, \tag{7.244}$$

which, besides the irreducible boson and fermion representations on $\mathbb{C}$, has an 
irreducible parafermionic representation $u_P$ on $H_P = \mathbb{C}^2$. This representation is 
most easily obtained explicitly by reducing the natural action of $\mathfrak{S}_3$ on $\mathbb{C}^3$. Define 
an orthonormal basis of the latter by 



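It follows that $\mathbb{C} \cdot e_0$ carries the trivial representation of $\mathfrak{S}_3$, whereas the linear 
span of $e_1$ and $e_2$ carries a two-dimensional irreducible representation $u_P$, given 
on the generators (12), (13), and (23) of $\mathfrak{S}_3$ by 

$$u_P(12) = \frac{1}{2}\begin{pmatrix} 1 & -\sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix}; \quad u_P(13) = \frac{1}{2}\begin{pmatrix} 1 & \sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}; \quad u_P(23) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7.246}$$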

We already gave realizations of the Hilbert space $H_P$ of three parafermions 
in (7.203) and (7.204), where it emerged as a subspace of $L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) \otimes 
L^2(\mathbb{R}^3) \cong L^2(\mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3)$. An equivalent realization $H^P \cong H_P$ may be given 
on the basis of (7.225), according to which $H^P$ is the subspace of $L^2(\mathbb{R}^3)^{\otimes 3} \otimes \mathbb{C}^2 \cong 
L^2(\mathbb{R}^9) \otimes \mathbb{C}^2$ that consists of doublet wave-functions $\psi_i$ ($i = 1,2$) that satisfy 

$$\psi_i(q_{p(1)}, q_{p(2)}, q_{p(3)}) = \sum_{j=1}^{2} u_{ij}(p)\psi_j(q_1,q_2,q_3), \tag{7.247}$$

for any permutation $p \in \mathfrak{S}_3$, where $u = u_P$, cf. (7.246). I.e., the parafermionic 
wave-functions in this realization of $H_P$ are constrained by the conditions 


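$$\psi_1(q_2,q_1,q_3) = \tfrac{1}{2}\psi_1(q_1,q_2,q_3) - \tfrac{1}{2}\sqrt{3}\,\psi_2(q_1,q_2,q_3); \tag{7.248}$$

$$\psi_2(q_2,q_1,q_3) = -\tfrac{1}{2}\sqrt{3}\,\psi_1(q_1,q_2,q_3) - \tfrac{1}{2}\psi_2(q_1,q_2,q_3); \tag{7.249}$$

$$\psi_1(q_3,q_2,q_1) = \tfrac{1}{2}\psi_1(q_1,q_2,q_3) + \tfrac{1}{2}\sqrt{3}\,\psi_2(q_1,q_2,q_3); \tag{7.250}$$

$$\psi_2(q_3,q_2,q_1) = \tfrac{1}{2}\sqrt{3}\,\psi_1(q_1,q_2,q_3) - \tfrac{1}{2}\psi_2(q_1,q_2,q_3); \tag{7.251}$$

$$\psi_1(q_1,q_3,q_2) = -\psi_1(q_1,q_2,q_3); \tag{7.252}$$

$$\psi_2(q_1,q_3,q_2) = \psi_2(q_1,q_2,q_3). \tag{7.253}$$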


The algebra of observables $C^*(G_{Q_3})$ of three indistinguishable particles without 
internal degrees of freedom, i.e., $B_0(L^2(\mathbb{R}^3)^{\otimes 3})^{\mathfrak{S}_3}$, then acts on $H^P \subset L^2(\mathbb{R}^3)^{\otimes 3} \otimes \mathbb{C}^2$ as in (7.234), 
identifying $a \in C^*(G_{Q_3})$ with $a \otimes 1_2$ (so that $a$ ignores the internal degree of 
freedom $\mathbb{C}^2$). This representation is irreducible by Theorem 7.12. 

N > 3. The above construction may be generalized to any $N > 3$. There will now 
be many parafermionic representations $u_\chi$ of $\mathfrak{S}_N$ (given by Young tableaus), each 
of which induces an irreducible representation of the C*-algebra (7.238). 
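
As a sanity check on the case N = 3, the following Python sketch verifies that the three 2×2 matrices that can be read off from the constraints (7.248) - (7.253) via (7.247) satisfy the relations of the transpositions in the permutation group on three letters. It also checks that they arise by reducing the natural action on C^3. The basis vectors `E0`, `E1`, `E2` are an assumption of this sketch: they form one orthonormal basis that reproduces these matrices, and need not coincide with the basis intended in the text. All function names are hypothetical.

```python
from math import sqrt

R3 = sqrt(3.0)

# Matrices of the transpositions, as read off from (7.248)-(7.253).
U_P = {
    (1, 2): [[0.5, -R3 / 2], [-R3 / 2, -0.5]],
    (1, 3): [[0.5, R3 / 2], [R3 / 2, -0.5]],
    (2, 3): [[-1.0, 0.0], [0.0, 1.0]],
}
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Assumed orthonormal basis of C^3 (one choice compatible with U_P).
E0 = [1 / sqrt(3.0)] * 3
E1 = [0.0, 1 / sqrt(2.0), -1 / sqrt(2.0)]
E2 = [-2 / sqrt(6.0), 1 / sqrt(6.0), 1 / sqrt(6.0)]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding errors."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def swap(v, i, j):
    """Natural action of the transposition (i j) on C^3: exchange coordinates i and j."""
    w = list(v)
    w[i - 1], w[j - 1] = w[j - 1], w[i - 1]
    return w


def dot(v, w):
    """Real inner product on C^3 (all vectors used here are real)."""
    return sum(x * y for x, y in zip(v, w))


def reduced(i, j):
    """Matrix of the transposition (i j) restricted to the span of E1 and E2."""
    basis = (E1, E2)
    return [[dot(ea, swap(eb, i, j)) for eb in basis] for ea in basis]


def satisfies_s3_relations(u):
    """Check s^2 = 1 for each transposition, (12)(23)(12) = (13), and ((12)(23))^3 = 1."""
    squares = all(close(matmul(m, m), IDENTITY) for m in u.values())
    conjugation = close(matmul(matmul(u[(1, 2)], u[(2, 3)]), u[(1, 2)]), u[(1, 3)])
    rotation = matmul(u[(1, 2)], u[(2, 3)])
    order_three = close(matmul(rotation, matmul(rotation, rotation)), IDENTITY)
    return squares and conjugation and order_three
```

Calling `satisfies_s3_relations(U_P)` confirms that the matrices define a representation, and comparing `reduced(i, j)` with `U_P[(i, j)]` confirms that, for the assumed basis, this representation is the one carried by the orthogonal complement of `E0`.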

The question now arises whether parastatistics is to be found in Nature, or, 
indeed, whether this question is even well defined! As a warm-up to the case $N = 3$, where 
the question first plays a role, let us give an alternative realization of $\pi^F(C^*(G_{Q_2}))$, 
cf. Theorem 7.12. Take two isospin doublet bosons (which by definition transform 
under the defining spin-$\tfrac{1}{2}$ representation $D_{1/2}$ of $SU(2)$ on $\mathbb{C}^2$). With 

//P) = (7.254) 


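and using indices $a_1, a_2 = 1,2$, the Hilbert space of these bosons is 

$$H_B^{(2)} = \{\psi \in H^{(2)} \mid \psi_{a_2a_1}(q_2,q_1) = \psi_{a_1a_2}(q_1,q_2)\}, \tag{7.255}$$

with corresponding projection $e_B^{(2)}$ given by 

$$e_B^{(2)}\psi_{a_1a_2}(q_1,q_2) = \tfrac{1}{2}\left(\psi_{a_2a_1}(q_2,q_1) + \psi_{a_1a_2}(q_1,q_2)\right). \tag{7.256}$$

Subsequently, define a partial isometry $w : H^{(2)} \to L^2(\mathbb{R}^3)^{\otimes 2}$ by 

$$w\psi(q_1,q_2) \equiv \psi_0(q_1,q_2) = \tfrac{1}{\sqrt{2}}\left(\psi_{12}(q_1,q_2) - \psi_{21}(q_1,q_2)\right). \tag{7.257}$$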


Physically, this singles out an isospin singlet Hilbert subspace $e_0H^{(2)}$ within 
$H^{(2)}$, where $e_0 = w^*w$ (which is a projection). This singlet subspace may be 
constrained to the bosonic sector by passing to 

$$H_0^{(2)} = e_0e_B^{(2)}H^{(2)}; \tag{7.258}$$

note that $e_0$ and $e_B^{(2)}$ commute. Now extend the defining representation of $C^*(G_{Q_2})$ 
on $L^2(\mathbb{R}^3)^{\otimes 2}$ to $H^{(2)}$ by ignoring the indices $a_1, a_2$ (i.e., isospin is deemed 
unobservable). This extended representation commutes with $e_0$ and with $e_B^{(2)}$ and hence 
is well defined on $H_0^{(2)} \subset H^{(2)}$. Let us denote this representation of $C^*(G_{Q_2})$ by $\pi_0^{(2)}$. It 
is then immediate from the property $\psi_0(q_2,q_1) = -\psi_0(q_1,q_2)$ that: 

Proposition 7.13. The representations $\pi_0^{(2)}(C^*(G_{Q_2}))$ on $H_0^{(2)}$ and $\pi^F(C^*(G_{Q_2}))$ on 
$H^F$ are unitarily equivalent. 

In other words, two fermions without internal degrees of freedom are equivalent 
to the singlet state of two bosons with an isospin degree of freedom, at least if 
the observables are isospin-blind. Similarly, two bosons without internal degrees of 
freedom are equivalent to the singlet state of two fermions with isospin, and two 
fermions without internal degrees of freedom are equivalent to the isospin triplet 
state of two fermions (this corresponds to the Schur decomposition of $(\mathbb{C}^2)^{\otimes 2}$ under 
the commuting actions of $\mathfrak{S}_2$ and $SU(2)$). 
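
The mechanism behind Proposition 7.13 can be illustrated on a finite toy model. The following Python sketch replaces the one-particle configuration space by three lattice sites (an assumption made purely for illustration), applies the bosonic projection in the form (7.256), and checks that the isospin-singlet component in the form (7.257) is antisymmetric in the positions. The function names and the lattice are hypothetical and not part of the text.

```python
import random
from math import sqrt

SITES = range(3)    # toy replacement for the one-particle configuration space
ISOSPIN = (1, 2)    # isospin indices a = 1, 2


def random_state(seed=0):
    """Random real element of the toy two-particle space, keyed by (a1, a2, q1, q2)."""
    rng = random.Random(seed)
    return {(a1, a2, q1, q2): rng.uniform(-1.0, 1.0)
            for a1 in ISOSPIN for a2 in ISOSPIN
            for q1 in SITES for q2 in SITES}


def bosonic_projection(psi):
    """Symmetrise under simultaneous exchange of positions and isospin, cf. (7.256)."""
    return {(a1, a2, q1, q2): 0.5 * (psi[(a2, a1, q2, q1)] + psi[(a1, a2, q1, q2)])
            for (a1, a2, q1, q2) in psi}


def singlet_component(psi):
    """Isospin-singlet wave-function (psi_12 - psi_21)/sqrt(2), cf. (7.257)."""
    return {(q1, q2): (psi[(1, 2, q1, q2)] - psi[(2, 1, q1, q2)]) / sqrt(2.0)
            for q1 in SITES for q2 in SITES}


def is_antisymmetric(phi, tol=1e-12):
    """True if phi(q2, q1) = -phi(q1, q2) for all pairs of sites."""
    return all(abs(phi[(q2, q1)] + phi[(q1, q2)]) < tol for (q1, q2) in phi)
```

For a bosonic state `psi = bosonic_projection(random_state())`, the call `is_antisymmetric(singlet_component(psi))` returns `True`, whereas a generic unsymmetrised state fails the test: the fermionic sign is carried entirely by the unobserved isospin labels.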
For $N = 3$ we may carry out a similar trick as for $N = 2$, and replace parafermions 
without (further) degrees of freedom by either bosons or fermions. We discuss the 
former and leave the explicit description of the various alternative descriptions to 
the reader. We proceed as for $N = 2$, mutatis mutandis. We have a Hilbert space 

//(3) = (7.259) 

of three distinguishable isospin doublets, containing the Hilbert space $H_B^{(3)}$ of three 
bosonic isospin doublets as a subspace, that is, 


$$H_B^{(3)} = \{\psi \in H^{(3)} \mid \psi_{a_{p(1)}a_{p(2)}a_{p(3)}}(q_{p(1)},q_{p(2)},q_{p(3)}) = \psi_{a_1a_2a_3}(q_1,q_2,q_3)\ (p \in \mathfrak{S}_3)\}. \tag{7.260}$$

The corresponding projection, denoted by $e_B^{(3)}$, will not be written 
down explicitly. Define an $SU(2)$ doublet $(\psi_1, \psi_2)$ within the space $H^{(3)}$ through a 
partial isometry 

w\i/i{quq2,q3) = ^{¥m{qi,q2,q3) - Vniiquqi^qs))', 


(7.261) 

(7.262) 


wW2iqi,q2,q3) = ^{-^¥2ii{qi,q2,q3) + ¥i2iiqi,q2,q3) + Wii2{qi,q2,q3))- 

(7.263) 


Defining a projection $e_2 = w^*w$ on $H^{(3)}$, the Hilbert space $H^{(3)}$ contains a closed 
subspace $H_2^{(3)} = e_2e_B^{(3)}H^{(3)}$, which is stable under the natural representation of 
$C^*(G_{Q_3})$ (since $e_2$ and $e_B^{(3)}$ commute). We call this representation $\pi_2^{(3)}$. An easy 
calculation then establishes: 

Proposition 7.14. The representations $\pi_2^{(3)}(C^*(G_{Q_3}))$ on $H_2^{(3)}$ and $\pi^P(C^*(G_{Q_3}))$ on 
$H^P$ (as defined by Theorem 7.12) are unitarily equivalent. 

In other words, three parafermions without internal degrees of freedom are 
equivalent to an isospin doublet formed by three identical bosonic isospin doublets 
(corresponding to the Schur decomposition of $(\mathbb{C}^2)^{\otimes 3}$ under the commuting actions of $\mathfrak{S}_3$ 
and $SU(2)$; in this decomposition, the spin-$\tfrac{3}{2}$ representation of $SU(2)$ couples to 
the bosonic representation of $\mathfrak{S}_3$, whilst the spin-$\tfrac{1}{2}$ representation of $SU(2)$ couples 
to the parafermionic representation of $\mathfrak{S}_3$), at least if the observables of the latter 
are isospin-blind. Many other realizations of parafermions in terms of fermions or 
bosons with an internal degree of freedom can be constructed in a similar way. 
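
The Schur decomposition invoked above can be checked by a character computation. The following Python sketch computes how often each irreducible representation of the permutation group on three letters occurs in the tensor cube of C^n, using the standard fact that a permutation with c cycles has trace n^c on that space. The labels "boson", "fermion" and "parafermion" for the trivial, sign and two-dimensional representations, and all function names, are choices made for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(p):
    """Number of cycles of a permutation p, given as a tuple with p[i] the image of i."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i]
    return cycles


def sign(p):
    """Sign of a permutation: (-1)^(n - number of cycles)."""
    return (-1) ** (len(p) - cycle_count(p))


def s3_characters(p):
    """Characters of the three irreducible representations at the permutation p."""
    fixed_points = sum(1 for i, image in enumerate(p) if i == image)
    return {"boson": 1, "fermion": sign(p), "parafermion": fixed_points - 1}


def multiplicities(n=2):
    """Multiplicity of each irreducible representation in the tensor cube of C^n."""
    group = list(permutations(range(3)))
    totals = {"boson": Fraction(0), "fermion": Fraction(0), "parafermion": Fraction(0)}
    for p in group:
        trace = n ** cycle_count(p)  # trace of p acting on the tensor cube of C^n
        for name, chi in s3_characters(p).items():
            totals[name] += Fraction(chi * trace, len(group))
    return {name: int(value) for name, value in totals.items()}
```

For n = 2 the multiplicity spaces have dimensions 4, 0 and 2: the bosonic representation is paired with a four-dimensional (spin-3/2) multiplet, the fermionic one does not occur, and the parafermionic one is paired with a doublet, in agreement with the dimension count 4·1 + 2·2 = 8.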

For $N > 3$ we similarly find that the representation $\pi^\chi$ of the C*-algebra (7.238) 
induced by some parafermionic representation $u_\chi$ of $\mathfrak{S}_N$ is unitarily equivalent 
to a representation on some $SU(n)$ multiplet of bosons with an internal degree of 
freedom; the appropriate multiplet is the one coupled to $u_\chi$ in the Schur reduction 
of $(\mathbb{C}^n)^{\otimes N}$ with respect to the natural and commuting actions of $\mathfrak{S}_N$ and $SU(n)$. 
The moral of this story is that one cannot tell from glancing at some Hilbert space 
whether the world consists of fermions or bosons or parafermions; what matters is 
the Hilbert space as a carrier of some (irreducible) representation of the algebra 
of observables. From that perspective we already see for N = 2 that being bosonic 
or fermionic is not an invariant property of such representations, since one may 
freely choose between fermions/bosons without internal degrees of freedom and 
bosons/fermions with internal degrees of freedom. In a more systematic discussion 
using superselection theory one may impose some physical selection criterion in or¬ 
der to restrict attention to “physically interesting” sectors. Such criteria (which, for 
example, would have the goal of excluding parastatistics) should be formulated with 
reference to some algebra of observables. Such issues cannot be settled at the level 
of quantum mechanics and instead require quantum field theory, where 
parastatistics can always be removed in terms of either bose- or fermi-statistics, in somewhat 
similar vein to our discussion. For (nonlocal) charges in gauge theories there are no 
rigorous results, but historically a similar goal played a role in the road to quantum 
chromodynamics (QCD), which is one of the ingredients of the Standard Model. 

A different argument against parastatistics arises from the state space approach 
based on the compact convex set studied at the beginning of this sec¬ 
tion. The extreme boundary dg consists of one part that is contained 

in de^{H^) = and one that is not. The first part consists of those one¬ 
dimensional invariant projections e € (//^)®^ whose image eH^ belongs to ei¬ 
ther the bosonic subspace (in which case u{p)e = e for each p G ©at) or the 

fermionic subspace of (in which case u(p)e = sgn(p)e for each p G ©at); 

in other words, pure bosonic or fermionic states on are also pure on 

The second part, then, consists of parastatistical pure states on B{H^)‘^^, 
which are therefore mixed on B(H^). Furthermore, pure bosonic or fermionic 
states on both extend and restrict to pure bosonic or fermionic states on 

2 j(^A'+i)©Af+i respectively, whereas parastatistical pure states 

turn out to have neither property and hence are “isolated” at the given value of N. 

Finally, in $d = 2$ the equivalence between the operator and configuration space 
approaches breaks down, because $\pi_1(Q_N) = B_N$, i.e., the braid group on $N$ 
strings. Even defining the operator quantum theory on $H_N = L^2(\tilde{Q}_N)$, with algebra 
of observables $M_N = B(L^2(\tilde{Q}_N))^{B_N}$, fails to rescue the equivalence, because the 
decomposition of $H_N$ under $M_N$ by no means contains all irreducible representations 
of $B_N$. In this case deformation quantization gives many more sectors than the 
improved operator approach (which already gave more sectors than the approach using 
‘multi-valued’ scalar wave-functions). 
Notes 

The quotations in the preamble are from Dirac (1947), p. 87. Similarly, the 
Dreimännerarbeit (Born, Heisenberg, & Jordan, 1926) bluntly states (in Ch. 1, §1) that: 

‘one can see from eq. (5) [i.e., $pq - qp = -i\hbar \cdot 1_H$, cf. our eq. (7.5)] that in the limit $\hbar = 0$, 
the new theory would converge to the classical theory, as is physically required.’ 

§7.1. Deformation quantization 

In the wake of Dirac’s famous insight on the analogy between the Poisson bracket 
and the commutator in quantum mechanics, the idea of deformation quantization (in 
the form of what we now call star products) may be traced back to Groenewold 
(1946) and Moyal (1949). The mathematical (physics) literature on the subject 
started with Berezin (1975) and Bayen et al. (1978), who introduced what we now 
call formal deformation quantization, in which $\hbar$ is not a real number but a formal 
parameter occurring in formal power series. The C*-algebraic setting for deforma¬ 
tion quantization we use was introduced by Rieffel (1989, 1994); see also Landsman 
(1998a), Chapter 2, for a detailed treatment. 

§7.2. Quantization and internal symmetry 

This section is based on Rieffel (1990) and Landsman (1998a), Chapter 3. 

§7.3. Quantization and external symmetry 
§7.4. Intermezzo: The Big Picture 

§7.5. Induced representations and the imprimitivity theorem 
§7.6. Representations of semi-direct products 

The action Poisson bracket (7.58) was introduced by Krishnaprasad & Marsden 
(1987); see also Marsden & Ratiu (1994). 

Systems of imprimitivity and their applications to representation theory, semi- 
direct products, and quantum mechanics are due to Mackey (1958, 1968), who 
was inspired by Weyl (1927, 1928), von Neumann (1932), and Wigner (1939). As 
Mackey (1978, 1992) describes, he saw his work as the development of what he 
calls Weyl’s Program. Weyl (1927) posed two questions in quantum mechanics: 

1. ‘How to construct the matrix of Hermitian form* that represents some quantity 
given in the context of a known physical system?’^ 

2. ‘Given this Hermitian form, what is its physical meaning, and which physical 
statements can we make about it?’^ 

Weyl considered the second question to have been resolved by von Neumann’s 
recent work, and so he concentrated on the first, which he tried to answer using 
group theory. The main achievement of Weyl (1927), elaborated in his subsequent 

* Like Hilbert himself, Weyl at the time still thought of operators in terms of matrices or Hermitian 
forms, rather than abstractly, like von Neumann. Also cf. our Introduction. 

^ ‘Wie komme ich zu der Matrix, der Hermiteschen Form, welche eine gegebene Größe in einem 
seiner Konstitution nach bekannten physikalischen System repräsentiert?’ (Weyl, 1927, p. 1) 

^ ‘Wenn einmal die Hermitesche Form gewonnen ist, was ist ihre physikalische Bedeutung, was 
für physikalische Aussagen kann ich ihr entnehmen?’ (ibid.) 
book Weyl (1928), was a reformulation of the canonical commutation relations 
$i[\hat{p},\hat{q}] = \hbar \cdot 1_H$ in terms of projective unitary representations of the additive group $\mathbb{R}^2$ 
(or, equivalently, of unitary representations of the associated Heisenberg group). 
He also introduced the formula (7.21) in an equivalent form where the (classical) 
Fourier expansion of $f$, i.e., 

$$f(p,q) = \int da\,db\; e^{iap+ibq}\hat{f}(a,b), \tag{7.264}$$

is “quantized” by the operator in which $\exp(iap + ibq)$ in the above formula is 
replaced by the (projective) unitary representative $u(a,b)$ of $(a,b) \in \mathbb{R}^2$ just 
mentioned, i.e., the real numbers $p$ and $q$ are replaced by the corresponding operators $\hat{p}$ 
and $\hat{q}$, as in (7.3) - (7.4). In particular, Weyl treated $p$ and $q$ symmetrically. 

In his development of Weyl’s Program, Mackey broke the symmetry between $p$ 
and $q$, in that he saw the momentum operator $\hat{p}$ as the (“infinitesimal”) generator 
of a unitary representation of the additive group $\mathbb{R}$, whereas the position operator 
$\hat{q}$ was replaced by a projection-valued measure on the real line; this is equivalent 
to a nondegenerate representation of the commutative C*-algebra $C_0(Q)$, as in our 
discussion in §7.3. This way of tearing $p$ and $q$ apart was the key to the general case 
of quantizing group actions on configuration space discussed in §7.3. 

In their independent elaboration of Weyl’s ideas, Groenewold (1946) and Moyal 
(1949) emphasized the deformation aspect of quantization (including the classical 
limit) rather than its group-theoretical underpinning; the former aspect is completely 
absent in Mackey’s work. “The Big Picture” (Landsman, 1998a, Ch. 3; Landsman & 
Ramazan, 2001; Landsman, 2007) is an attempt to have the best of both worlds, in 
that the role of Lie groupoids delivers the symmetry aspect of quantization, whereas 
our (i.e. Rieffel’s) very definition of quantization puts the deformation aspect in the 
front seat. The underlying theory of Lie groupoids and Lie algebroids may be found 
in Moerdijk & Mrcun (2003) or Mackenzie (2005); see also Landsman (1998a). 

A comprehensive study of the Mackey-Glimm dichotomy may be found in 
Williams (2007), which contains a wealth of information on crossed product C*- 
algebras and induced representations in general. 

The representation theory of the Poincaré group was first studied (using 
somewhat heuristic methods) by Wigner (1939) using induced representations. The entire 
subject was subsequently taken up and finished by Mackey. For treatments in the 
spirit of (mathematical) physics see e.g. Simms (1968), Niederer & O’Raifeartaigh 
(1974), and Barut & Rączka (1977). Lemma 7.10 is proved by Bargmann (1954). 

Among the known elementary particles, the case $j = 0$ (and $m > 0$) corresponds 
to the Higgs boson, whereas $j = \tfrac{1}{2}$ gives all known fermionic particles (i.e., 
electrons, quarks, neutrinos, and their antiparticles). If one counts the gauge bosons $W^\pm$ 
and $Z^0$ as massive, they provide the case $j = 1$, but in the fundamental Lagrangian 
they are massless and correspond to helicity $n = \pm 1$, like the photon. Helicity $\pm 2$ 
gives the graviton. We discard particles predicted by supersymmetry, which evi¬ 
dently does not exist in nature (this evidence seems lost on string theorists). 
§7.7. Quantization and permutation symmetry 

This section is based on Landsman (2016a). The literature on indistinguishable 
particles is enormous, initiated by Heisenberg (1926) and Dirac (1926). What we 
call the “quantum observables” approach goes back to Messiah & Greenberg (1964); 
see also Drühl, Haag, & Roberts (1970). Key papers in the configuration space 
approach are Souriau (1967), Laidlaw & DeWitt-Morette (1971) and Leinaas & 
Myrheim (1977). More generally, for the quantization of multiply connected spaces 
see Dowker (1972), Schulman (1981), Isham (1984), Horváthy, Morandi, & 
Sudarshan (1989), Morchio & Strocchi (2007), and Morandi (1992). The state space 
approach to indistinguishable particles was proposed by Bach (1997), who proves 
(7.192) - (7.193), as well as the claim following these equations to the effect that 
p € iff eH is irreducible. The state space arguments against parastatis- 

tics given near the end of this section are also due to Bach (1997). 

The representation theory used in this section may be found in many books, such 
as Weyl (1928), Fulton (1997), or Goodman & Wallach (2000). 

The groupoid (7.209) is a special case of the so-called gauge groupoid defined 
by a principal $H$-bundle $\pi : P \to Q$, where $G_1 = P \times_H P$ (which stands for $(P \times P)/H$ 
with respect to the diagonal $H$-action on $P \times P$), $G_0 = Q$, and the operations are 

$$s([p,q]) = \pi(q), \quad t([p,q]) = \pi(p), \quad [x,y]^{-1} = [y,x], \quad [p,q][q,r] = [p,r];$$

here $[p,q][q',r]$ is defined whenever $\pi(q) = \pi(q')$, but to write down the product 
one picks some element q G %^^{q'). 

Recent philosophical literature on indistinguishable particles includes French & 
Krause (2006), Earman (2010), Caulton & Butterfield (2012), Saunders (2013), and 
Baker, Halvorson, & Swanson (2015). This philosophical literature still needs to 
be integrated with the mathematical approach launched in this section, and it was 
indeed the goal of Earman, Halvorson, & Landsman (2013ish) to do so. Alas! 





Chapter 8 

Limits: large N 


Beside the limit $\hbar \to 0$, we consider the limit $N \to \infty$, where $N$ could be the principal 
quantum number labeling orbits in atomic physics (as in Bohr’s Correspondence 
Principle), or the number of particles or lattice sites, or the number of identical 
experiments in a long run measuring the relative frequencies of possible outcomes. 

The case of large quantum numbers will be dealt with first: as our toy model 
of a classical orbit we take a coadjoint orbit in the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$ 
of a compact connected Lie group $G$, see §5.9; for $G = SU(2)$ or $SO(3)$ these are 
simply two-spheres $S^2$. The corresponding quantum theories are indexed by their 
spin $j = \tfrac{1}{2}n$, where $n \in \mathbb{N}$, which we send to infinity in order to recover the classical 
orbit. This can be done more generally by rescaling the highest weight $\lambda$ of some 
fixed irreducible representation of $G$ to $n\lambda$ and again letting $n \to \infty$. 

The second case, where the limit $N \to \infty$ is typically the thermodynamic limit 
(namely if the density $N/V$ is kept fixed, where $V$ is the volume of the system sent to 
infinity, too), has been rigorously studied using operator algebras since the 1960s. In 
such work the system constructed at the limit $N = \infty$ is typically quantum statistical 
mechanics in infinite volume, whose existence (followed by the establishment of 
e.g. phase transitions) was a major achievement of mathematical physics. 

However, our goal in taking the limit $N \to \infty$ is quite different, in that, in the 
spirit of Bohrification, our limiting system will be classical; from the traditional 
point of view we look at the macroscopic rather than the quasi-local observables. 
Nonetheless, for each finite value of $N \in \mathbb{N}$ our (quantum) system will be the same 
as in the usual theory! Like the first case, in which increasingly large matrix 
algebras converge to an algebra of continuous functions on some compact space, this 
apparent miracle is described by the theory of continuous bundles of C*-algebras, 
as outlined in §C.19. As in the case studied in the previous chapter, this theory 
provides a convenient mathematical machinery for studying the limit $N \to \infty$ also. 

We then apply the limit $N \to \infty$ to $N$ repeated experiments, and, applying the 
doctrine of classical concepts, rederive the Born rule (avoiding the conceptual and 
mathematical pitfalls of various previous attempts to do so). 

Bridging the gap to the next two chapters, we close with an introduction to quan¬ 
tum spin systems (as a later playing ground for spontaneous symmetry breaking). 
8.1 Large quantum numbers 

As in §5.9, let $G$ be a compact connected Lie group with Lie algebra $\mathfrak{g}$ and dual 
$\mathfrak{g}^*$, and let $T \subset G$ be a maximal torus with Lie algebra $\mathfrak{t}$ and dual $\mathfrak{t}^*$. Let $\mathcal{O}_\lambda$ be a 
regular integral coadjoint orbit in $\mathfrak{g}^*$, labeled by a dominant weight $\lambda \in \Lambda^+$. This 
means that there is a point $\theta \in \mathcal{O}_\lambda$ whose stabilizer $G_\theta$ is $T$, and $\lambda = \theta|_{\mathfrak{t}}$; conversely, 
$\lambda \in \mathfrak{t}^*$ determines $\theta \in \mathfrak{g}^*$, which vanishes on each generator $E_\alpha$ of $\mathfrak{g}_{\mathbb{C}}$ ($\alpha \in \Delta$). 

Following Theorems 5.49 and 5.51, we associate a unitary irreducible 
representation $u_\lambda : G \to U(H_\lambda)$ to $\mathcal{O}_\lambda$ (or rather to $\lambda$), whose underlying Hilbert space $H_\lambda$ 
contains a unique highest weight vector $\upsilon_\lambda$. We then have (5.228). We abbreviate 

$$d_\lambda = \dim(H_\lambda). \tag{8.1}$$

For $SU(2)$ we have $\lambda \in \mathbb{N}_0/2 = \{0, \tfrac{1}{2}, 1, \ldots\}$, usually called $j$, and the (regular) 
coadjoint orbits in $\mathfrak{g}^* \cong \mathbb{R}^3$ are the spheres $S_j^2$ with radius $j$ (with $j \neq 0$). The 
corresponding highest weight representation $u_j$ is carried by $H_j$ with $d_j = 2j+1$, whose 
highest weight vector $\upsilon_j$ is an eigenvector of $L_3 = iu'(S_3)$ with eigenvalue $j$. 

We are going to define a continuous bundle of C*-algebras over the base space 

$$I = (1/\mathbb{N}) \cup \{0\} = 1/\dot{\mathbb{N}}, \tag{8.2}$$

where $\mathbb{N} = \{1,2,\ldots\}$ and $\dot{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$; as required, $I$ contains 0 as an accumulation 
point. One may think of elements of $I$ as “quantized” values of Planck’s constant 
$\hbar = 1/N$, upon which the limit $N \to \infty$ is formally the same as the limit $\hbar \to 0$. 

If $\lambda \in \Lambda^+$, then $n\lambda \in \Lambda^+$ for all $n \in \mathbb{N}$. We may therefore define the C*-algebras 

$$A_0 = C(\mathcal{O}_\lambda); \tag{8.3}$$

$$A_{1/n} = B(H_{n\lambda}). \tag{8.4}$$

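For each $f \in C(\mathcal{O}_\lambda)$ we define $f_\lambda = \pi^*f$ under the canonical projection $\pi : G \to 
G/G_\theta \cong \mathcal{O}_\lambda$ (i.e., $f_\lambda(x) = f(\pi(x))$), which enables us to define the operators 

$$Q_{1/n}(f) = d_{n\lambda}\int_G dx\, f_\lambda(x)\,|u_{n\lambda}(x)\upsilon_{n\lambda}\rangle\langle u_{n\lambda}(x)\upsilon_{n\lambda}| \in A_{1/n}. \tag{8.5}$$

In fact, the entire integrand in (8.5) is a function on $\mathcal{O}_\lambda$, because for $z \in T$ we have 
$u_{n\lambda}(xz)\upsilon_{n\lambda} = u_{n\lambda}(x)u_{n\lambda}(z)\upsilon_{n\lambda} = \chi_{n\lambda}(z)u_{n\lambda}(x)\upsilon_{n\lambda}$, 
and the phase $\chi_{n\lambda}(z) \in \mathbb{T}$ cancels the factor $\overline{\chi_{n\lambda}(z)}$ from the last term in (8.5). Note that 

$$Q_{1/n}(1_{\mathcal{O}_\lambda}) = 1_{H_{n\lambda}}, \tag{8.6}$$

as follows by taking $\psi_2 = \psi_3 = \upsilon_{n\lambda}$ in Schur’s well-known orthogonality relations 

$$d_{n\lambda}\int_G dx\, \langle\psi_1, u_{n\lambda}(x)\psi_2\rangle\langle u_{n\lambda}(x)\psi_3, \psi_4\rangle = \langle\psi_1,\psi_4\rangle\langle\psi_3,\psi_2\rangle \quad (\psi_i \in H_{n\lambda}). \tag{8.7}$$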
Other properties of the maps $Q_{1/n} : C(\mathcal{O}_\lambda) \to B(H_{n\lambda})$ (between C*-algebras) are: 

• Self-adjointness, i.e., $Q_{1/n}(f)^* = Q_{1/n}(f^*)$. 

• Positivity, i.e., $Q_{1/n}(f) \geq 0$ whenever $f \geq 0$. 

• Equivariance, i.e., writing $L_yf(x) = f(y^{-1}x)$ as usual, for any $y \in G$ we have 

$$Q_{1/n}(L_yf) = u_{n\lambda}(y)Q_{1/n}(f)u_{n\lambda}(y)^*. \tag{8.8}$$

Positivity does not follow from self-adjointness, as $Q_{1/n}$ is not a homomorphism. 
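
For $G = SU(2)$ these properties can be made concrete. The following Python sketch computes, in exact rational arithmetic, the operator attached to a function that depends only on the height $z$ on the sphere. By equivariance such an operator commutes with rotations about the $z$-axis and is therefore diagonal in the weight basis. The sketch rests on assumptions not taken from the text: the sphere is rescaled to radius 1, the squared coherent-state components are the binomial weights `comb(N, k) * t**k * (1 - t)**(N - k)` with `t = (1 + z)/2`, and the normalised invariant measure becomes `(N + 1)` times Lebesgue measure on `0 <= t <= 1`, where `N = 2j`. All names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta(a, b):
    """Exact Beta integral of t^(a-1) (1-t)^(b-1) over [0, 1] for integers a, b >= 1."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def berezin_diagonal(coeffs, N):
    """Diagonal entries of the operator attached to f = sum_r coeffs[r] * t**r.

    Here t = (1 + z)/2 and N = 2j; entry k = 0..N belongs to the weight m = k - N/2.
    The coherent-state weights and the normalisation are assumptions of this sketch.
    """
    diagonal = []
    for k in range(N + 1):
        weight = (N + 1) * comb(N, k)
        diagonal.append(sum(Fraction(c) * weight * beta(k + r + 1, N - k + 1)
                            for r, c in enumerate(coeffs)))
    return diagonal


ONE = [1]                 # f = 1
Z = [-1, 2]               # f = z = 2t - 1
Z_SQUARED = [1, -4, 4]    # f = z^2 = (2t - 1)^2
ONE_PLUS_Z = [0, 2]       # f = 1 + z = 2t, a non-negative function
```

In this model `berezin_diagonal(ONE, N)` is the identity, in line with (8.6), and `berezin_diagonal(Z, N)` has entries `(2k - N)/(N + 2)`, i.e. `m/(j + 1)`. The operator norm is thus `N/(N + 2)`, which is smaller than the supremum of `|z|` for every finite `N` and tends to it as `N` grows. The diagonal for `Z_SQUARED` differs from the square of the diagonal for `Z`, which illustrates that the map is not a homomorphism.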

Theorem 8.1. There exists a continuous bundle of C*-algebras $A$ over $I$ as defined 
in (8.2), with fibers (8.3) - (8.4), whose continuous sections are given by all 
sequences $(a_\hbar)_{\hbar \in I}$ for which $a_0 \in C(\mathcal{O}_\lambda)$ and $a_{1/n} \in B(H_{n\lambda})$, and the 
sequence $(a_{1/n})_{n \in \mathbb{N}}$ is asymptotically equivalent to $(Q_{1/n}(a_0))_{n \in \mathbb{N}}$, in the sense that 

$$\lim_{n \to \infty} \|a_{1/n} - Q_{1/n}(a_0)\| = 0. \tag{8.9}$$

In particular, if $f \in C(\mathcal{O}_\lambda)$, then the cross-section of $A$ defined by 

$$a_0 = f; \tag{8.10}$$

$$a_{1/n} = Q_{1/n}(f), \tag{8.11}$$

is continuous. In fact, we have a deformation quantization of $\mathcal{O}_\lambda$ in the sense of 
Definition 7.1, where the Poisson structure of $\mathcal{O}_\lambda$ is inherited from (minus) the canonical 
one on the Poisson manifold $\mathfrak{g}^*$, but we shall merely prove the claim of the theorem. 

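Proof. This will follow from Proposition C.124, in whose notation $\tilde{A}$ (which will 
actually coincide with $A$) consists of all $\tilde{a} = (\tilde{a}_\hbar)_{\hbar \in I}$ where $f$ runs through $C(\mathcal{O}_\lambda)$ in 

$$\tilde{a}_0 = f; \tag{8.12}$$

$$\tilde{a}_{1/n} = Q_{1/n}(f). \tag{8.13}$$

To verify the conditions for Proposition C.124 we start with the property that the set 
$\{\tilde{a}_\hbar \mid \tilde{a} \in \tilde{A}\}$ be dense in $A_\hbar$; we will show that it even coincides with $A_\hbar$. At $\hbar = 0$ 
this is true by construction. At $\hbar = 1/n$, the required property 

$$Q_{1/n}(C(\mathcal{O}_\lambda)) = B(H_{n\lambda}) \tag{8.14}$$

can be proved in two steps. For simplicity we set $n = 1$; the proof is the same for 
any $n \in \mathbb{N}$. The first step is to define a function $L_a$ on $G$ for each $a \in B(H_\lambda)$ by 

$$L_a(x) = \mathrm{Tr}\left(a\,|u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|\right) = \langle\upsilon_\lambda, u_\lambda(x)^*au_\lambda(x)\upsilon_\lambda\rangle. \tag{8.15}$$

This function is continuous and is right-invariant under $T$, so that $L_a$ is really an 
element of $C(\mathcal{O}_\lambda)$. Thus we have a map $L : B(H_\lambda) \to C(\mathcal{O}_\lambda)$, $a \mapsto L_a$. Furthermore, 

$$\langle a, Q_1(f)\rangle_{HS} = \langle L_a, f\rangle_2, \tag{8.16}$$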
{a,Ql{f))HS = {LaJ)2, (8.16) 
where the Hilbert-Schmidt inner product on the left-hand side is $\langle a,b\rangle_{HS} = \mathrm{Tr}(a^*b)$, 
cf. (B.495), which is well defined since $H_\lambda$ is finite-dimensional, and the 
right-hand side is the inner product on $L^2(\mathcal{O}_\lambda)$ with respect to the measure induced by the 
subspace of $L^2(G, d_\lambda \cdot dx)$ consisting of $T$-invariant functions. Now $Q_{1/n}(C(\mathcal{O}_\lambda))$ is 
a (necessarily closed) linear subspace of $B(H_\lambda)$, which coincides with $B(H_\lambda)$ iff its 
orthogonal complement in the Hilbert-Schmidt inner product is zero. 

Hence (8.14) is equivalent to the implication: $a \in (Q_{1/n}(C(\mathcal{O}_\lambda)))^\perp \Rightarrow a = 0$. By 
(8.16), the antecedent holds iff $\langle L_a, f\rangle_2 = 0$ for each $f \in C(\mathcal{O}_\lambda)$, which, because 
$C(\mathcal{O}_\lambda)$ is dense in $L^2(\mathcal{O}_\lambda)$, holds iff $L_a = 0$. Hence the above implication is 
equivalent to: $L_a = 0 \Rightarrow a = 0$, i.e., $\ker L = \{0\}$. We must therefore prove the latter. 

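If $L_a(x) = 0$ for all $x \in G$, then, taking $x = \exp(t_1 A_1) \cdots \exp(t_n A_n)$, where each
$A_i \in \mathfrak{g}$, and applying (5.156) for each $t_i$ to the right-hand side of (8.15), we obtain

\[ \langle v_\lambda, [u'_\lambda(A_n), \cdots [u'_\lambda(A_2), [u'_\lambda(A_1), a]] \cdots ]\, v_\lambda \rangle = 0. \tag{8.17} \]

This equality extends to $\mathfrak{g}_{\mathbb{C}}$, so we may take $A_i = E_{\alpha_i}$ for some positive root $\alpha_i \in \Delta^+$.
Since $u'_\lambda(E_\alpha) v_\lambda = 0$ for $\alpha \in \Delta^+$, of each commutator $[u'_\lambda(E_{\alpha_i}), a]$ only the term
$u'_\lambda(E_{\alpha_i}) a$ contributes. Moving the $u'_\lambda(E_{\alpha_i})$ to act as $u'_\lambda(E_{\alpha_i})^* = u'_\lambda(E_{-\alpha_i})$ on the
vector on the left in the inner product in (8.17) gives all other eigenvectors of $\mathfrak{t}$, so
that (8.17) implies $\langle \psi, a v_\lambda \rangle = 0$ for each $\psi \in H_\lambda$, and hence $a v_\lambda = 0$. Now it is
clear from (8.15) that $L_{u_\lambda(y)^* a u_\lambda(y)}(x) = L_a(yx)$, so if $L_a(x) = 0$ for all $x \in G$, then
also $L_{u_\lambda(y)^* a u_\lambda(y)}(x) = 0$ for all $x \in G$. Hence we may replace $a$ by $u_\lambda(y)^* a u_\lambda(y)$ in
the above argument, finding $u_\lambda(y)^* a u_\lambda(y) v_\lambda = 0$ and hence $a u_\lambda(y) v_\lambda = 0$ for each
$y \in G$. Since $u_\lambda$ is irreducible, this implies $a\psi = 0$ for any $\psi \in H_\lambda$, and hence $a = 0$.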

This completes the proof of (8.14). Proposition C.124 furthermore requires

\[ \lim_{n\to\infty} \| Q_{1/n}(f) \| = \| f \|_\infty. \tag{8.18} \]

This follows from the following key property (to be proved at the end):

\[ \lim_{n\to\infty} \langle u_{n\lambda}(y) v_{n\lambda}, Q_{1/n}(f)\, u_{n\lambda}(y) v_{n\lambda} \rangle = f_\lambda(y), \tag{8.19} \]

for any $y \in G$ and $f \in C(\mathcal{O}_\lambda)$. Indeed, for any $y \in G$ we obviously have

\[ \| Q_{1/n}(f) \| \ge | \langle u_{n\lambda}(y) v_{n\lambda}, Q_{1/n}(f)\, u_{n\lambda}(y) v_{n\lambda} \rangle |. \tag{8.20} \]

Since $G$ and hence $\mathcal{O}_\lambda$ is compact, by Weierstrass's Theorem there is a $y \in G$ such
that $|f_\lambda(y)| = \|f\|_\infty$. Using this $y$ in (8.20) and (8.19), the two of these imply

\[ \liminf_{n\to\infty} \| Q_{1/n}(f) \| \ge \| f \|_\infty. \tag{8.21} \]

Conversely, for any unit vector $\psi \in H_{n\lambda}$, eqs. (8.5) and (8.7) imply

\[ | \langle \psi, Q_{1/n}(f) \psi \rangle | \le \| f \|_\infty. \tag{8.22} \]

If $f$ is real-valued, then $Q_{1/n}(f)^* = Q_{1/n}(f^*) = Q_{1/n}(f)$. In that case, (8.22) implies

\[ \| Q_{1/n}(f) \| \le \| f \|_\infty. \tag{8.23} \]

By the C*-identity $\|a^* a\| = \|a\|^2$, this is true for any $f \in C(\mathcal{O}_\lambda)$. Therefore,

\[ \limsup_{n\to\infty} \| Q_{1/n}(f) \| \le \| f \|_\infty. \tag{8.24} \]

Eqs. (8.21) and (8.24) yield (8.18). It remains to prove (8.19), i.e.,

\[ \lim_{n\to\infty} d_{n\lambda} \int_G dx\, f_\lambda(x)\, | \langle u_{n\lambda}(y) v_{n\lambda}, u_{n\lambda}(x) v_{n\lambda} \rangle |^2 = f_\lambda(y). \tag{8.25} \]

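The key to the proof is the fact that if $\lambda$ and $\mu$ are dominant weights, with associated
highest weight representations $u_\lambda$ and $u_\mu$, respectively, for any $x \in G$ one has

\[ \langle v_\lambda, u_\lambda(x) v_\lambda \rangle \cdot \langle v_\mu, u_\mu(x) v_\mu \rangle = \langle v_{\lambda+\mu}, u_{\lambda+\mu}(x) v_{\lambda+\mu} \rangle. \tag{8.26} \]

Namely, because the exponential map is surjective for compact connected Lie
groups, eq. (8.26) is equivalent to the property

\[ \langle v_\lambda, u'_\lambda(A) v_\lambda \rangle + \langle v_\mu, u'_\mu(A) v_\mu \rangle = \langle v_{\lambda+\mu}, u'_{\lambda+\mu}(A) v_{\lambda+\mu} \rangle, \tag{8.27} \]

for any $A \in \mathfrak{g}$. For $A \in \mathfrak{t}$ this amounts to $\lambda + \mu = \lambda + \mu$, cf. (5.228), whereas for
$A = E_\alpha$ for some root $\alpha \in \Delta$ we have $0 = 0$, so that (8.27) is true for all $A \in \mathfrak{g}$. This
also proves (8.26), of which we need the special (and iterated) case

\[ \langle v_{n\lambda}, u_{n\lambda}(x) v_{n\lambda} \rangle = \langle v_\lambda, u_\lambda(x) v_\lambda \rangle^n. \tag{8.28} \]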

This motivates us to introduce a sequence $(\mu_n)$ of probability measures on $G$ by

\[ d\mu_n(x) = d_{n\lambda} \cdot dx\, | \langle v_\lambda, u_\lambda(x) v_\lambda \rangle |^{2n}, \tag{8.29} \]

so that, after a change $x \mapsto yx$ of the integration variable, eq. (8.25) reads

\[ \lim_{n\to\infty} \int_G d\mu_n(x)\, f_\lambda(yx) = f_\lambda(y), \tag{8.30} \]

for any $f \in C(\mathcal{O}_\lambda)$. Now $F(x) = | \langle v_\lambda, u_\lambda(x) v_\lambda \rangle |$ takes values in $[0,1]$ and hence
the measure (8.29) is $d\mu_n(x) \sim \exp(-nS(x))$ for $S(x) = -\ln(F(x))$, with $S \ge 0$ and
$S(x) = 0$ iff $x \in G_\lambda = T$ (using regularity of the orbit). In that case, i.e., if $z \in T$,
then $f_\lambda(yz) = f(\pi(yz)) = f(\pi(y)) = f_\lambda(y)$. The method of steepest descent shows
that any part of $G$ (of positive Haar measure) where $S(x) > 0$ makes no contribution
as $n \to \infty$, so that we may replace $f_\lambda(yx)$ in (8.30) by $f_\lambda(y)$, obtaining

\[ \lim_{n\to\infty} \int_G d\mu_n(x)\, f_\lambda(yx) = f_\lambda(y) \lim_{n\to\infty} \int_G d\mu_n(x) = f_\lambda(y) \lim_{n\to\infty} 1 = f_\lambda(y). \tag{8.31} \]
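The concentration of the measures (8.29) on $T$ can be made explicit in the simplest case. The following worked example is an illustration under stated assumptions, not part of the proof: we assume $G = SU(2)$ with $\lambda$ the fundamental weight (spin 1/2), so that $d_{n\lambda} = n + 1$; we write $x_{11}$ for the upper-left matrix entry of $x \in SU(2)$ (taking $v_\lambda$ to be the first basis vector and $T$ the diagonal subgroup); and we use the standard fact that $u = |x_{11}|^2$ is uniformly distributed on $[0,1]$ under normalized Haar measure. The symbols $x_{11}$, $u$ and $\varepsilon$ are introduced here only for the illustration.

```latex
% Assumptions: G = SU(2), lambda = spin 1/2, d_{n lambda} = n + 1,
% and u = |x_{11}|^2 is uniform on [0,1] under normalized Haar measure dx.
\begin{align*}
\langle v_\lambda, u_\lambda(x) v_\lambda \rangle &= x_{11},
  \qquad F(x) = |x_{11}|, \\
\mu_n(G) &= (n+1) \int_G dx\, |x_{11}|^{2n}
          = (n+1) \int_0^1 du\, u^n = 1, \\
\mu_n\big(\{ x \in G : |x_{11}|^2 \le 1 - \varepsilon \}\big)
         &= (n+1) \int_0^{1-\varepsilon} du\, u^n
          = (1-\varepsilon)^{n+1} \xrightarrow{\; n \to \infty \;} 0
  \qquad (0 < \varepsilon < 1).
\end{align*}
```

In this example $|x_{11}| = 1$ holds exactly when $x$ is diagonal, i.e., $x \in T$, so for every $\varepsilon > 0$ the mass of $\mu_n$ outside an $\varepsilon$-neighbourhood of $T$ decays exponentially in $n$, which is the behaviour used in (8.31).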

We have now verified conditions 1 and 2 in Proposition C.124, and no. 3 is trivially
satisfied since in condition 1 we have equality with $A_\hbar$, as shown above. □
8.2 Large systems

We now move from large quantum numbers within a single system to large quantum
systems that consist of $N$ identical sites, where we eventually study what happens
as $N \to \infty$ (as is customary in quantum statistical mechanics we change notation
from $n \in \mathbb{N}$ to $N \in \mathbb{N}$). This limit gives rise to two different continuous bundles
$A^{(c)}$ and $A^{(q)}$ of C*-algebras over $I$ as given by (8.2), which have exactly the same
fibers at $1/N$ but, amazingly, differ dramatically at $N = \infty$, i.e., $1/N = 0$. This difference
reflects two choices one may make for the $N$-particle observables that have a limit
as $N \to \infty$, namely local ones, giving rise to a highly non-commutative limit algebra
$A_0^{(q)}$ (which is the one usually studied in quantum statistical mechanics of infinite
systems), and macroscopic ones, which generate a commutative algebra $A_0^{(c)}$ of
observables of an infinite quantum system (describing classical thermodynamics as a
limit of quantum statistical mechanics). It is the latter that we need for Bohrification.

Let $B$ be a fixed unital C*-algebra, describing a single quantum system. The
case of a two-level system, where $B = M_2(\mathbb{C})$, is already fascinating, and many
other interesting examples are described by finite-dimensional C*-algebras. Though
irrelevant in finite dimension, we note that the constructions below are generally
valid if (for technical reasons to be found in Proposition C.97) we use the projective
tensor product $\hat{\otimes}_{\max}$ between C*-algebras; see §C.13. For any $N \in \mathbb{N}$ we put

\[ A^{(c)}_{1/N} = A^{(q)}_{1/N} = B^N, \tag{8.32} \]

i.e., the $N$-fold (projective) tensor product $\hat{\otimes}^N_{\max} B$ of $B$ with itself. Furthermore,

\[ A_0^{(c)} = C(S(B)); \tag{8.33} \]

\[ A_0^{(q)} = B^\infty, \tag{8.34} \]

where $S(B)$ is the state space of $B$, seen as a compact convex set in the weak*-topology,
as usual, and $B^\infty$ is the infinite (projective) tensor product of $B$ with itself
as described in §C.14; see especially (C.318) with $C_i = B$ for each $i$. For example,
the state space of $B = M_2(\mathbb{C})$ is affinely homeomorphic to the unit ball in $\mathbb{R}^3$, whose
boundary is the familiar Bloch sphere of qubits; see Proposition 2.9.

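We now explain how (8.32) and (8.33) - (8.34) give rise to continuous bundles
$A^{(c)}$ and $A^{(q)}$ of C*-algebras, starting with the former. First, for each $N \in \mathbb{N}$, let $\mathfrak{S}_N$
be the permutation group (i.e. symmetric group) on $N$ objects, acting on $B^N$ in the
obvious way, i.e., by linear and continuous extension of

\[ \alpha_p^{(N)}(b_1 \otimes \cdots \otimes b_N) = b_{p(1)} \otimes \cdots \otimes b_{p(N)}, \tag{8.35} \]

where $b_i \in B$. This yields a symmetrization operator $S_N : B^N \to B^N$ defined by

\[ S_N = \frac{1}{N!} \sum_{p \in \mathfrak{S}_N} \alpha_p^{(N)}. \tag{8.36} \]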
If $B$ is infinite-dimensional, these maps can be extended by continuity to the
completion $B^N = \hat{\otimes}^N_{\max} B$ of the algebraic tensor product $\otimes^N B$; indeed, passing to any
faithful representation of $B$ it is easy to see that $S_N$ is even continuous with respect
to the minimal cross-norm (cf. §C.13). For $N \ge M$ we then define

\[ S_{M,N} : B^M \to B^N \tag{8.37} \]

by linear (and if necessary continuous) extension of

\[ S_{M,N}(a_{1/M}) = S_N(a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B), \tag{8.38} \]

with $N - M$ copies of the unit $1_B \in B$ so as to obtain an element of $B^N$. Clearly,
$S_{N,N} = S_N$. In particular, $S_{1,N} : B \to B^N$ gives the average of $b$ over $N$ copies of $B$:

\[ S_{1,N}(b) = \frac{1}{N} \sum_{k=1}^N 1_B \otimes \cdots \otimes b_{(k)} \otimes 1_B \otimes \cdots \otimes 1_B, \tag{8.39} \]

where $b_{(k)}$ denotes $b$ placed in the $k$-th slot.

For example, take $B = M_n(\mathbb{C})$ for simplicity, and pick some $a = a^* \in B$ and
$\lambda \in \sigma(a)$, with associated spectral projection $e_\lambda$. Putting $b = e_\lambda$ in (8.39) gives

\[ f_N^\lambda = S_{1,N}(e_\lambda). \tag{8.40} \]

This is a frequency operator: applied to states of the kind $\upsilon_1 \otimes \cdots \otimes \upsilon_N \in (\mathbb{C}^n)^N$,
where each $\upsilon_i$ is an eigenstate of $a$, so that $a \upsilon_i = \lambda_i \upsilon_i$ for some $\lambda_i \in \sigma(a)$, the
corresponding operator counts the relative frequency of $\lambda$ in the list $(\lambda_1, \ldots, \lambda_N)$.
The commutative case $B = C(X)$ provides a classical analogue. Eq. (C.271) gives

\[ B^N = C(X)^N \cong C(X^N), \tag{8.41} \]

so that, identifying elements of $B^N$ with functions on $X^N$, for $f \in C(X)$ we have

\[ S_{1,N}(f)(x_1, \ldots, x_N) = \frac{1}{N} \sum_{k=1}^N f(x_k) = \frac{1}{N} \big( f(x_1) + \cdots + f(x_N) \big). \tag{8.42} \]
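The averaging map (8.42) and the eigenvalue of the frequency operator (8.40) on a product of eigenstates are easy to compute by hand; the following minimal Python sketch does so with exact rational arithmetic. The function names `averaged_observable` and `relative_frequency` are illustrative choices made here, and the sketch assumes a finite list of sample points (or eigenvalues) with $N \ge 1$.

```python
from fractions import Fraction


def averaged_observable(f, xs):
    """Classical S_{1,N}(f)(x_1, ..., x_N) = (1/N) * sum_k f(x_k), cf. (8.42).

    Assumes f returns integers or Fractions, so that the result is exact.
    """
    n = len(xs)
    return sum(Fraction(f(x)) for x in xs) / n


def relative_frequency(eigenvalues, lam):
    """Eigenvalue of the frequency operator (8.40) on v_1 (x) ... (x) v_N,
    where a v_i = lambda_i v_i: the relative frequency of lam in the list."""
    return averaged_observable(lambda x: 1 if x == lam else 0, eigenvalues)


# Example: spin-like outcomes +1 / -1 over N = 4 sites.
outcomes = [1, -1, 1, 1]
print(relative_frequency(outcomes, 1))            # fraction of sites with +1
print(averaged_observable(lambda x: x, outcomes))  # average outcome
```

The second function is just the first one applied to an indicator function, mirroring the fact that (8.40) is (8.39) with $b = e_\lambda$.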

We return to the construction of a continuous bundle of C*-algebras with fibers
(8.32) and (8.33). As in §8.1, we construct this bundle by specifying a preliminary
family of continuous cross-sections and then using Proposition C.124 to finish.

Definition 8.2. We say that a sequence $(a_{1/N})_{N \in \mathbb{N}}$, with $a_{1/N} \in B^N$, is symmetric
when there exist $M \in \mathbb{N}$ and $\tilde{a}_{1/M} \in B^M$ such that for each $N \ge M$ one has

\[ a_{1/N} = S_{M,N}(\tilde{a}_{1/M}). \tag{8.43} \]

This implies $a_{1/M} = S_M(\tilde{a}_{1/M})$. Symmetric sequences can start in any finite way
they like, but their infinite tails consist of averaged observables. Hence symmetric
sequences asymptotically commute: if $(a_{1/N})$ and $(b_{1/N})$ are symmetric, then

\[ \lim_{N\to\infty} \| a_{1/N} b_{1/N} - b_{1/N} a_{1/N} \| = 0, \tag{8.44} \]

simply because the commutators of single-site operators are nonvanishing only at
finitely many positions, upon which the factor $1/N$ in (8.39) guarantees (8.44).

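For example, if $B = M_2(\mathbb{C})$, and $(\sigma_i)$ are the Pauli matrices, we have

\[ \big[ S_{1,N}(\tfrac{1}{2}\hbar\sigma_1), S_{1,N}(\tfrac{1}{2}\hbar\sigma_2) \big] = i\, \frac{\hbar}{N}\, S_{1,N}(\tfrac{1}{2}\hbar\sigma_3), \tag{8.45} \]

et cetera, showing that the averaged spin-$\tfrac{1}{2}$ operators effectively replace $\hbar$ by $\hbar/N$.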
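The scaling in (8.45) can be checked numerically for small $N$. The sketch below is a minimal pure-Python verification, assuming the standard Pauli matrices and the averaging map (8.39); the helper names (`kron`, `averaged`, `commutator_defect`) are chosen here for illustration, and matrices are represented as nested lists to avoid any external dependency.

```python
# Pauli matrices and the 2x2 identity as nested lists.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def averaged(b, n):
    """S_{1,N}(b) = (1/N) sum_k 1 (x) ... (x) b (x) ... (x) 1, cf. (8.39)."""
    dim = 2 ** n
    total = [[0] * dim for _ in range(dim)]
    for k in range(n):
        term = [[1]]
        for site in range(n):
            term = kron(term, b if site == k else I2)
        total = add(total, term)
    return scale(1 / n, total)


def commutator_defect(n, hbar=1.0):
    """Largest entry of [S(hbar s1/2), S(hbar s2/2)] - i (hbar/N) S(hbar s3/2)."""
    s1, s2, s3 = (averaged(scale(hbar / 2, s), n) for s in (SX, SY, SZ))
    lhs = add(matmul(s1, s2), scale(-1, matmul(s2, s1)))
    rhs = scale(1j * hbar / n, s3)
    return max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))


for n in (1, 2, 3):
    print(n, commutator_defect(n))  # each defect should vanish up to rounding
```

The check only covers $N \le 3$ (matrices of size up to $8 \times 8$); the general statement follows from the fact that operators at different sites commute, so that only the $N$ diagonal terms of the double sum survive.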

In view of this, it is reasonable to expect that we may be able to assemble the
algebras $B^N$ into a continuous bundle whose limit algebra at $N = \infty$ is commutative.
For each symmetric sequence we define a function $a_0 : S(B) \to \mathbb{C}$ by

\[ a_0(\omega) = \lim_{N\to\infty} \omega^N(a_{1/N}), \tag{8.46} \]

where $\omega \in S(B)$, and $\omega^N \in S(B^N)$ is defined by linear (and continuous) extension of

\[ \omega^N(b_1 \otimes \cdots \otimes b_N) = \omega(b_1) \cdots \omega(b_N); \tag{8.47} \]

continuity of $\omega^N$ on the algebraic tensor product $\otimes^N B$ (and hence extendibility to
$A_{1/N}$) is guaranteed by Proposition C.98, although this is not really needed here
because $a_0$ only requires the values of $\omega^N$ on $\otimes^N B$ itself. In any case, the limit exists
by definition of a symmetric sequence, from which we also see that $a_0 \in C(S(B))$,
because it is a finite sum of finite products of the type $\omega(b_1) \cdots \omega(b_M)$, each of
which is continuous in $\omega$ by definition of the w*-topology on $S(B)$.
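A short worked example may make the limit (8.46) concrete. It is an illustration only: the elements $b_1, b_2 \in B$, the generator $\tilde{a}_{1/2} = b_1 \otimes b_2$, and the Bloch vector $r$ of a qubit state are symbols introduced here. The first step uses that the product state $\omega^N$ is permutation-invariant, so that $\omega^N \circ S_N = \omega^N$.

```latex
% Symmetric sequence generated by \tilde a_{1/2} = b_1 \otimes b_2 (illustrative).
\begin{align*}
a_{1/N} &= S_{2,N}(b_1 \otimes b_2)
         = S_N(b_1 \otimes b_2 \otimes 1_B \otimes \cdots \otimes 1_B)
         \qquad (N \ge 2), \\
\omega^N(a_{1/N}) &= \omega^N(b_1 \otimes b_2 \otimes 1_B \otimes \cdots \otimes 1_B)
         = \omega(b_1)\, \omega(b_2)\, \omega(1_B)^{N-2}
         = \omega(b_1)\, \omega(b_2), \\
a_0(\omega) &= \omega(b_1)\, \omega(b_2).
\end{align*}
% Qubit case: B = M_2(C), \omega(b) = Tr(\rho b) with
% \rho = (1/2)(1 + r_1 \sigma_1 + r_2 \sigma_2 + r_3 \sigma_3), |r| \le 1.
\[
\tilde a_{1/2} = \sigma_1 \otimes \sigma_2
\quad \Longrightarrow \quad
a_0(\omega) = r_1 r_2 .
\]
```

In the qubit case the limit function is thus a polynomial in the coordinates of the unit ball, in line with the description of $S(M_2(\mathbb{C}))$ given earlier in this section.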

For example, the frequency operators (8.40) define a symmetric sequence $(f_N^\lambda)_{N \in \mathbb{N}}$,
whose limit function $f_0^\lambda : S(B) \to \mathbb{C}$ in the sense of (C.560) or (8.46) is

\[ f_0^\lambda(\omega) = \omega(e_\lambda). \tag{8.48} \]

Thus (8.46) gives the Born probability for the outcome $a = \lambda$ in the state $\omega$; see
§8.4. Classically, identifying elements of $S(C(X))$ with probability measures $\mu$ on
$X$, the limit of the sequence $(S_{1,N}(f))$ for fixed $f \in C(X)$, cf. (8.42), is

\[ a_0(\mu) = \int_X d\mu\, f. \tag{8.49} \]

This convergence is an example of the strong law of large numbers, see §8.3.

We return to the general case. 

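Definition 8.3. A sequence $(a_{1/N})_{N \in \mathbb{N}}$ as above is quasi-symmetric if for each $N \in \mathbb{N}$
one has $a_{1/N} = S_N(a_{1/N})$ and for any $\varepsilon > 0$ there is a symmetric sequence $(\tilde{a}_{1/N})$
and some $M \in \mathbb{N}$ such that $\| a_{1/N} - \tilde{a}_{1/N} \| < \varepsilon$ for all $N > M$.

For example, if $\lim_{N\to\infty} \| a_{1/N} - \tilde{a}_{1/N} \| = 0$ for some fixed symmetric sequence
$(\tilde{a}_{1/N})$, then $(a_{1/N})_{N \in \mathbb{N}}$ is obviously quasi-symmetric.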
Theorem 8.4. For any unital C*-algebra $B$, the C*-algebras (8.32) and (8.33), i.e.,

\[ A_0^{(c)} = C(S(B)); \tag{8.50} \]

\[ A_{1/N}^{(c)} = B^N, \tag{8.51} \]

where $B^N$ is the $N$-fold projective tensor power of $B$, are the fibers of a continuous
bundle $A^{(c)}$ of C*-algebras over $I = (1/\mathbb{N}) \cup \{0\} = 1/\dot{\mathbb{N}}$ whose continuous
cross-sections are the quasi-symmetric sequences $(a_{1/N})$ with limit $a_0$ given by (8.46).

As in Theorem 8.1, also here we have a deformation quantization of $S(B)$ in the
sense of Definition 7.1, where the Poisson bracket on $S(B)$ may be defined by
specifying its value on linear functions $\hat{b} \in C(S(B))$, where $b \in B$ and $\hat{b}(\omega) = \omega(b)$, by

\[ \{ \hat{a}, \hat{b} \} = i\, \widehat{[a,b]}. \tag{8.52} \]

Unfortunately, this involves the theory of infinite-dimensional Poisson manifolds,
which we prefer to omit. Thus we shall only prove Theorem 8.4 as stated.

The proof relies on Størmer's quantum De Finetti Theorem 8.6 below.

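Definition 8.5. Let $B$ be a unital C*-algebra. A state $\rho$ on $B^N$ is called:

• permutation-invariant if $\rho \circ \alpha_p^{(N)} = \rho$ for any $p \in \mathfrak{S}_N$, where $\alpha_p^{(N)}$ is the
  permutation action (8.35).

• $K$-exchangeable ($K \in \mathbb{N}$) if it is permutation-invariant and in addition $\rho$ is the
  restriction to $B^N$ of some permutation-invariant state on $B^{N+K}$.

• Infinitely exchangeable if it is $K$-exchangeable for all $K \in \mathbb{N}$.

The set of all permutation-invariant states / $K$-exchangeable states / infinitely
exchangeable states on $B^N$ is denoted by $S^{\mathfrak{S}_N}(B^N)$ / $S^{\mathfrak{S}_N}_K(B^N)$ / $S^{\mathfrak{S}_N}_\infty(B^N)$.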

Theorem 8.6. Let $B$ be a unital C*-algebra. For any $N \in \mathbb{N}$ the correspondence
$\omega^N \leftrightarrow \omega$, where $\omega \in S(B)$ and $\omega^N \in S(B^N)$, cf. (8.47), gives a bijection

\[ \partial_e S^{\mathfrak{S}_N}_\infty(B^N) \cong S(B). \tag{8.53} \]

This theorem was originally stated (in the language of infinite tensor products) as
Theorem 8.9 in §8.3, where it (and hence Theorem 8.6) will also be proved.

We also need a formula for the norm of any self-adjoint element $a$ of any C*-algebra
$A$ in terms of the state space $S(A)$ and the pure state space $P(A)$, viz.

\[ \| a \| = \sup\{ |\omega(a)| : \omega \in S(A) \} = \sup\{ |\omega(a)| : \omega \in P(A) \}. \tag{8.54} \]

This follows from Proposition C.15, the spectral radius formula (B.254), and
compactness of $\sigma(a)$, implying that the supremum in (B.254) is reached on $\sigma(a)$.

Proof. The proof of Theorem 8.4 is quite similar to the proof of Theorem 8.1, in
that we once again rely on Proposition C.124, where the symmetric sequences are
going to play the role of $\tilde{A}$. To apply Proposition C.124, we should prove that:

1. The set $\tilde{A}_0$ (consisting of all $a_0 \in A_0 = C(S(B))$ as defined by (8.46), where $(a_{1/N})$
   runs through all symmetric sequences) is a *-algebra which is dense in $A_0$.

2. For any symmetric sequence $(a_{1/N})$ with limit $a_0$ as given by (8.46), one has

\[ \lim_{N\to\infty} \| a_{1/N} \| = \| a_0 \|_\infty. \tag{8.55} \]

To prove the first claim, we first note that $\tilde{A}_0$ is the linear span of all finite products
$\omega \mapsto \omega(b_1) \cdots \omega(b_N)$, where $N \in \mathbb{N}$ and $b_1, \ldots, b_N \in B$. Since $\overline{\omega(b)} = \omega(b^*)$ this is
obviously a *-algebra. The monomials $\hat{b}(\omega) = \omega(b)$ already separate points of $S(B) \subset B^*$,
since if $\omega' \ne \omega$ then clearly there is some $b \in B$ for which $(\omega - \omega')(b) \ne 0$.
Hence claim no. 1 follows from the Stone-Weierstrass Theorem B.51.

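For the second, let $(a_{1/N})$ be a symmetric sequence. Since there are $M \in \mathbb{N}$ and
$\tilde{a}_{1/M} \in B^M$ with $a_{1/M} = S_M(\tilde{a}_{1/M})$ and $a_{1/(M+K)} = S_{M,M+K}(\tilde{a}_{1/M})$ for all $K \in \mathbb{N}$,

\begin{align*}
\| a_{1/M} \| &= \sup\{ |\rho(a_{1/M})| : \rho \in S(B^M) \} = \sup\{ |\rho(\tilde{a}_{1/M})| : \rho \in S^{\mathfrak{S}_M}(B^M) \}; \\
\| a_{1/(M+K)} \| &= \sup\{ |\rho(S_{M,M+K}(\tilde{a}_{1/M}))| : \rho \in S(B^{M+K}) \} \\
&= \sup\{ |\rho(\tilde{a}_{1/M})| : \rho \in S^{\mathfrak{S}_M}_K(B^M) \},
\end{align*}

where we used (8.54) and (8.43). Theorem 8.6 and (8.46) then yield (8.55):

\begin{align*}
\lim_{N\to\infty} \| a_{1/N} \| &= \lim_{K\to\infty} \| a_{1/(M+K)} \| \\
&= \sup\{ |\rho(\tilde{a}_{1/M})| : \rho \in S^{\mathfrak{S}_M}_\infty(B^M) \} \\
&= \sup\{ |\rho(\tilde{a}_{1/M})| : \rho \in \partial_e S^{\mathfrak{S}_M}_\infty(B^M) \} = \sup\{ |\omega^M(\tilde{a}_{1/M})| : \omega \in S(B) \} \\
&= \sup\{ | \lim_{N\to\infty} \omega^N(a_{1/N}) | : \omega \in S(B) \} = \sup\{ |a_0(\omega)| : \omega \in S(B) \} \\
&= \| a_0 \|_\infty.
\end{align*}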

The proof that the sequences $(a_{1/N})$ for which condition (C.552) in Proposition
C.124 holds are precisely the quasi-symmetric sequences is the same as the
proof of the equivalence of the two conditions in Lemma C.125, taking $\hbar_0 = 0$.

Finally, it is easy to show that the limit (8.46) exists also for quasi-symmetric
observables $a$: take $\varepsilon > 0$ and find $\tilde{a}$ and $M$ as in Definition 8.3. For this $\tilde{a}$, let $M_0$ be
such that (8.43) holds (with $M$ replaced by $M_0$). For all $N, N'$ greater than both $M$ and $M_0$,

\begin{align*}
| \omega^N(a_{1/N}) - \omega^{N'}(a_{1/N'}) | &\le | \omega^N(a_{1/N} - \tilde{a}_{1/N}) - \omega^{N'}(a_{1/N'} - \tilde{a}_{1/N'}) | \\
&\quad + | \omega^N(\tilde{a}_{1/N}) - \omega^{N'}(\tilde{a}_{1/N'}) | \\
&\le \| a_{1/N} - \tilde{a}_{1/N} \| + \| a_{1/N'} - \tilde{a}_{1/N'} \| + 0 \\
&< 2\varepsilon, \tag{8.56}
\end{align*}

since $\| \omega^N \| = 1$. Hence $(\omega^N(a_{1/N}))$ is a Cauchy sequence (in $\mathbb{C}$). □

Our second continuous bundle of C*-algebras of interest is described by the
following changes in Definitions 8.2 and 8.3.

Definition 8.7. Let $B$ be a unital C*-algebra and let $a_{1/N} \in B^N$ for each $N \in \mathbb{N}$.

• A sequence $(a_{1/N})_{N \in \mathbb{N}}$ is called local when there exist $M \in \mathbb{N}$ and $a_{1/M} \in B^M$
  such that for each $N \ge M$ one has

\[ a_{1/N} = a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B, \tag{8.57} \]

  with $N - M$ copies of the unit $1_B \in B$ (so that indeed $a_{1/N} \in B^N$).

• A sequence $(a_{1/N})_{N \in \mathbb{N}}$ is quasi-local if for any $\varepsilon > 0$ there is a local sequence
  $(\tilde{a}_{1/N})$ and some $M \in \mathbb{N}$ such that $\| a_{1/N} - \tilde{a}_{1/N} \| < \varepsilon$ for all $N > M$.

For the right analogue of Theorem 8.4 we recall the description of the infinite tensor
product $B^\infty$, cf. §C.14, especially the explanation preceding (C.315). Accordingly, a
dense subspace of $B^\infty$ is given by equivalence classes of local sequences $(a_{1/N})_{N \in \mathbb{N}}$
under the equivalence relation $a \sim a'$ iff $\lim_{N\to\infty} \| a_{1/N} - a'_{1/N} \| = 0$; the C*-algebraic
operations in $B^\infty$ are inherited from the $B^N$, and if we denote the equivalence class
of $(a_{1/N})_N$ by $[a_{1/N}]_N$, the norm in $B^\infty$ is given by

\[ \| [a_{1/N}]_N \| = \lim_{N\to\infty} \| a_{1/N} \|. \tag{8.58} \]

By construction, this number is independent of the representative $(a_{1/N})_N$ in the
class $[a_{1/N}]_N$. By definition, $B^\infty$ is the completion of the space of these equivalence
classes in the norm (8.58). As explained after (C.315), for each $M \in \mathbb{N}$ we have an
injective (and hence isometric) homomorphism $\varphi_M : B^M \to B^\infty$ that maps $a_{1/M} \in B^M$
to the equivalence class $[a_{1/N}]_N$ of the sequence $(a_{1/N})_N$ defined by

\[ a_{1/N} = 0 \quad (N < M); \tag{8.59} \]

\[ a_{1/N} = a_{1/M} \quad (N = M); \tag{8.60} \]

\[ a_{1/(M+K)} = a_{1/M} \otimes 1_B \otimes \cdots \otimes 1_B \quad (K > 0), \tag{8.61} \]

with $K$ copies of $1_B$. It is easy to verify that one might as well have started from
quasi-local sequences and their equivalence classes, for which the limit (8.58) exists
by an argument similar to (8.56). In that case the ensuing C*-algebra is already
complete, which leads to a direct description of the elements of $B^\infty$ as equivalence
classes of quasi-local sequences. This fact also follows from the following analogue
of Theorem 8.4, which may be proved in the same way, i.e., from Proposition C.124,
where this time the elements of $\tilde{A}$ are local sequences rather than symmetric ones
(in fact, the proof is much easier, since this time we obtain (C.552) for free):

Theorem 8.8. For any unital C*-algebra $B$, the C*-algebras (8.32) and (8.34), i.e.,

\[ A_0^{(q)} = B^\infty; \tag{8.62} \]

\[ A_{1/N}^{(q)} = B^N, \tag{8.63} \]

are the fibers of a continuous bundle $A^{(q)}$ of C*-algebras over $I = 1/\dot{\mathbb{N}}$ whose
continuous cross-sections are the quasi-local sequences $(a_{1/N})$ with limit $a_0 = [a_{1/N}]_N$.
8.3 Quantum de Finetti Theorem

As an initial step in exploring the connection between the bundles $A^{(c)}$ and $A^{(q)}$,
we prove Theorem 8.6, which we first restate in an equivalent form. Let $\mathfrak{S}_\infty$ be the
group of bijections of $\mathbb{N}$ that differ from the identity only on a finite set. Each such
finite permutation $p \in \mathfrak{S}_\infty$ defines a map $\alpha_p : B^\infty \to B^\infty$, as follows. Let $S \subset \mathbb{N}$ be the
finite subset of $\mathbb{N}$ on which $p$ acts nontrivially (if $S = \emptyset$ we have $p = \mathrm{id}_{\mathbb{N}}$, in which
case also $\alpha_p = \mathrm{id}_{B^\infty}$, see below). Take a local sequence $(a_{1/N})_N$, so that (8.57) holds,
in which we may assume $M > \max S$; we also redefine $a_{1/N} = 0$ for each $N < M$.
For each $N \ge M > \max S$, the map $p$ may be regarded as an element $p_N$ of $\mathfrak{S}_N$ by
restriction to $\{1, \ldots, N\} \subset \mathbb{N}$ and hence $p$ acts on $B^N$ by permuting the entries in
elementary tensor products of operators, cf. (8.35). For each $p \in \mathfrak{S}_\infty$, define a map

\[ \alpha_p : B^\infty \to B^\infty; \tag{8.64} \]

\[ \alpha_p([a_{1/N}]_N) = [\alpha^{(N)}_{p_N}(a_{1/N})]_N. \tag{8.65} \]

This uses a specific representative of the equivalence class $[a_{1/N}]_N \in B^\infty$, but
nonetheless the map $\alpha_p$ is well defined. Furthermore, since each $\alpha^{(N)}_{p_N} : B^N \to B^N$ is
an automorphism (i.e., an invertible homomorphism), it is an isometry, so that also
$\alpha_p$ is an isometry on its domain and hence extends to an automorphism of $B^\infty$. The
ensuing map $p \mapsto \alpha_p$ from $\mathfrak{S}_\infty$ to the group $\mathrm{Aut}(B^\infty)$ of all automorphisms of $B^\infty$ is
a homomorphism of groups, and we say that $\mathfrak{S}_\infty$ is an automorphism group of $B^\infty$.

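Writing $S^{\mathfrak{S}_\infty}(B^\infty)$ for the set of all $\mathfrak{S}_\infty$-invariant states on $B^\infty$, i.e., $\rho \in S^{\mathfrak{S}_\infty}(B^\infty)$
iff $\rho \circ \alpha_p = \rho$ for each $p \in \mathfrak{S}_\infty$, we may now rephrase Theorem 8.6 as follows:

Theorem 8.9. Let $B$ be a unital C*-algebra. There is a bijection

\[ \partial_e S^{\mathfrak{S}_\infty}(B^\infty) \cong S(B), \tag{8.66} \]

given by $\omega^\infty \leftrightarrow \omega$, where $\omega \in S(B)$, and $\omega^\infty \in S(B^\infty)$ is defined by, cf. (8.47),

\[ \omega^\infty([a_{1/N}]_N) = \lim_{N\to\infty} \omega^N(a_{1/N}). \tag{8.67} \]

This is essentially the same as Theorem 8.6: for any $M \in \mathbb{N}$, a state on $B^M$ is infinitely
exchangeable iff it is the restriction of an element of $S^{\mathfrak{S}_\infty}(B^\infty)$ to $B^M \subset B^\infty$, where
the inclusion is given by the map $\varphi_M$ defined below (8.58).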

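Proof. Let $S(B) \subset S^{\mathfrak{S}_\infty}(B^\infty)$ under the map $\omega \mapsto \omega^\infty$. We first show the inclusion

\[ \partial_e S^{\mathfrak{S}_\infty}(B^\infty) \subseteq S(B) \tag{8.68} \]

contrapositively, i.e., if $\rho \in S^{\mathfrak{S}_\infty}(B^\infty)$ does not lie in $S(B)$, then $\rho$ has a nontrivial
convex decomposition in $S^{\mathfrak{S}_\infty}(B^\infty)$. We identify $B^N$ with $\varphi_N(B^N) \subset B^\infty$ and denote
the restriction of $\rho$ to $B^N$ by $\rho_N$. If $\rho = \omega^\infty$ for some $\omega \in S(B)$, then

\[ \rho_{M+K}(a'_{1/M} \otimes a_{1/K}) = \rho_M(a'_{1/M})\, \rho_K(a_{1/K}) \tag{8.69} \]

for each $a_{1/K} \in B^K$ and $a'_{1/M} \in B^M$. If (8.69) holds whenever $0 \le a'_{1/M} \le 1_{B^M}$, then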
by Lemma C.53 and (C.8) it always holds. Adding suitable multiples of the unit and 
rescaling, it follows that if (8.69) holds whenever 

j • IflAr < ^\/M — I' (8.70) 

then it always holds. Therefore, if (8.69) fails, then it fails for some satisfying 

(8.70) and some and in which case j < However, such a failure 

implies the existence of a nontrivial convex decomposition

\[ \rho = t\rho' + (1-t)\rho'', \tag{8.71} \]

with $t = \rho_M(a'_{1/M})$, and the functionals $\rho'$ and $\rho''$ on $B^\infty$ are defined by

\[ \rho'([a_{1/N}]_N) = \lim_{N\to\infty} \rho_{M+N}(a'_{1/M} \otimes a_{1/N}) / \rho_M(a'_{1/M}); \tag{8.72} \]

\[ \rho''([a_{1/N}]_N) = \lim_{N\to\infty} \rho_{M+N}((1_{B^M} - a'_{1/M}) \otimes a_{1/N}) / \rho_M(1_{B^M} - a'_{1/M}). \tag{8.73} \]

These limits exist on symmetric sequences (where they stabilize), and hence they
exist in general. Furthermore, since $\rho_M(1_{B^M}) = 1$, the property (8.71)
is obvious. Both $\rho'$ and $\rho''$ belong to $S^{\mathfrak{S}_\infty}(B^\infty)$, since each functional $\rho_{M+N}$ is an
element of $S^{\mathfrak{S}_{M+N}}(B^{M+N})$. Finally, (8.71) is nontrivial, since if $\rho' = \rho''$, then $\rho' = \rho$,
and hence (8.69) would hold (whose violation we assumed). This proves (8.68).
Though it is always true, for simplicity we prove the converse inclusion

\[ S(B) \subseteq \partial_e S^{\mathfrak{S}_\infty}(B^\infty) \tag{8.74} \]

just for the case where $B$ is generated by projections, as in the case $B = M_n(\mathbb{C})$,
$B = B(H)$, or $B$ a von Neumann algebra, or more generally an AW*-algebra (see
§C.24). In that case also each $B^N$ is generated by its projections.

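For each $\rho \in S^{\mathfrak{S}_\infty}(B^\infty)$, each $N \in \mathbb{N}$, and each projection $e \in B^N$, we have

\[ \rho_N(e)^2 \le \rho_{2N}(e \otimes e), \tag{8.75} \]

see below. Assuming (8.75), suppose $\omega \in S(B)$ and $\omega^\infty = t\rho' + (1-t)\rho''$ for some
$t \in (0,1)$ and $\rho', \rho'' \in S^{\mathfrak{S}_\infty}(B^\infty)$. Since $(\omega^\infty)_N = \omega^N$, we then have

\begin{align*}
\omega^N(e)^2 &= \big( t\rho'_N(e) + (1-t)\rho''_N(e) \big)^2
  = \left\langle \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix},
    \begin{pmatrix} \rho'_N(e)\sqrt{t} \\ \rho''_N(e)\sqrt{1-t} \end{pmatrix} \right\rangle^2 \\
&\le \left\langle \begin{pmatrix} \rho'_N(e)\sqrt{t} \\ \rho''_N(e)\sqrt{1-t} \end{pmatrix},
    \begin{pmatrix} \rho'_N(e)\sqrt{t} \\ \rho''_N(e)\sqrt{1-t} \end{pmatrix} \right\rangle
  = t\rho'_N(e)^2 + (1-t)\rho''_N(e)^2 \\
&\le t\rho'_{2N}(e \otimes e) + (1-t)\rho''_{2N}(e \otimes e) \\
&= \omega^{2N}(e \otimes e) = \omega^N(e)^2,
\end{align*}

where the inner product in the first line is the usual one in $\mathbb{C}^2$ and, noting it is
positive, we have used the Cauchy-Schwarz inequality for this inner product (the
vector $(\sqrt{t}, \sqrt{1-t})$ has unit norm), as well as (8.75). Hence both inequalities must be
equalities, and for the first one this implies $\rho'_N(e) = \rho''_N(e)$. Since this is true for
all $N$ and all projections in $B^N$, this implies $\rho' = \rho'' = \omega^\infty$, so that
$\omega^\infty \in \partial_e S^{\mathfrak{S}_\infty}(B^\infty)$ and (8.74) has been established, up to the proof of (8.75).
To this effect, note that for each $M \in \mathbb{N}$ and $t \in \mathbb{R}$ we have

\begin{align*}
& \rho_{MN}\big( (1_{B^N} \otimes \cdots \otimes 1_{B^N} \otimes e + \cdots + e \otimes 1_{B^N} \otimes \cdots \otimes 1_{B^N} + t \cdot 1_{B^{MN}})^2 \big) \tag{8.76} \\
&\quad = M(M-1)\rho_{2N}(e \otimes e) + M\rho_N(e) + 2tM\rho_N(e) + t^2, \tag{8.77}
\end{align*}

with $M - 1$ copies of $1_{B^N}$ and $e$ moving from right to left in the first line, leaving
$M$ terms before the final one $t \cdot 1_{B^{MN}}$ in (8.76). In working out the square in (8.76)
and moving to the second line we used $e^2 = e$ as well as permutation invariance of
the state $\rho_{MN}$. The point is that (8.76) is positive, so that (8.77) must be positive,
too, for all $M \in \mathbb{N}$ and $t \in \mathbb{R}$. Now a function $f(t) = t^2 + 2bt + c = (t+b)^2 - b^2 + c$
obviously satisfies $f(t) \ge 0$ for each $t$ iff $b^2 \le c$, so that (8.76) is positive for all $t$ iff

\[ M^2 \rho_N(e)^2 \le M(M-1)\rho_{2N}(e \otimes e) + M\rho_N(e). \]

Letting $M \to \infty$ gives (8.75). □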
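For completeness, the last limit can be spelled out as follows; this is only an expansion of the final step, with $b = M\rho_N(e)$ and $c = M(M-1)\rho_{2N}(e \otimes e) + M\rho_N(e)$ read off from (8.77).

```latex
\begin{align*}
b^2 \le c
  \quad &\Longleftrightarrow \quad
  M^2 \rho_N(e)^2 \le M(M-1)\, \rho_{2N}(e \otimes e) + M \rho_N(e) \\
  &\Longleftrightarrow \quad
  \rho_N(e)^2 \le \Big( 1 - \frac{1}{M} \Big) \rho_{2N}(e \otimes e)
                 + \frac{1}{M}\, \rho_N(e)
  \qquad (\text{dividing by } M^2), \\
  &\xrightarrow{\; M \to \infty \;} \quad
  \rho_N(e)^2 \le \rho_{2N}(e \otimes e).
\end{align*}
```

Since $\rho_N(e)$ and $\rho_{2N}(e \otimes e)$ do not depend on $M$, the inequality for every finite $M$ passes to the limit, which is (8.75).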

Taking $B = C(X)$ for some compact Hausdorff space $X$, in view of (8.41) the
situation may be transferred to the Cartesian product $X^N$, equipped with the product
topology (which is generated by products $A_1 \times \cdots \times A_N \subset X^N$ with each $A_i \subset X$
open) and the ensuing Borel $\sigma$-algebra (generated by the above products with each
$A_i$ Borel). If $\mu_1, \ldots, \mu_N$ are (probability) measures on $X$ (in which case we write
$\mu_i \in \mathrm{Pr}(X)$), then there is a unique (probability) measure $\mu_1 \times \cdots \times \mu_N$ whose value
on a product as above is equal to $\mu_1(A_1) \cdots \mu_N(A_N)$. In particular, any probability
measure $\mu \in \mathrm{Pr}(X)$ on $X$ defines a probability measure $\mu^N$ on $X^N$.

The symmetric group $\mathfrak{S}_N$ acts on $X^N$ in the obvious way, and hence it acts on
the power set $\mathcal{P}(X^N)$. We call the latter action $\alpha^{(N)}$, so that for $p \in \mathfrak{S}_N$ we have

\[ \alpha_p^{(N)}(A_1 \times \cdots \times A_N) = A_{p(1)} \times \cdots \times A_{p(N)}. \tag{8.78} \]

The Cartesian product $X^\infty = X^{\mathbb{N}}$ is well defined both topologically and
measure-theoretically (the topology is generated by all products $\prod_i A_i$ with finitely many $A_i$
open and different from $X$, and likewise for the Borel structure), and the infinite
symmetric group $\mathfrak{S}_\infty = \bigcup_N \mathfrak{S}_N$ acts on it in the obvious way, in that $p \in \mathfrak{S}_N \subset \mathfrak{S}_\infty$
permutes the first $N$ coordinates. Specializing Definition 8.5 to $B = C(X)$, we obtain:

Definition 8.10. A probability measure $\nu_N$ on $X^N$ is called:

• permutation-invariant if $\nu_N \circ \alpha_p^{(N)} = \nu_N$ for any $p \in \mathfrak{S}_N$.

• $K$-exchangeable ($K \in \mathbb{N}$) if it is permutation-invariant and in addition is the
  restriction to $X^N$ of some permutation-invariant probability measure on $X^{N+K}$.

• exchangeable if it is $K$-exchangeable for all $K \in \mathbb{N}$.

A probability measure $\nu_\infty$ on $X^\infty$ is called permutation-invariant if $\nu_\infty \circ \alpha_p^{(N)} = \nu_\infty$
for any $p \in \mathfrak{S}_N$ and $N \in \mathbb{N}$, where $\alpha_p^{(N)}$ acts on $\prod_i A_i$ by (8.78) on the first $N$ factors
$A_1, \ldots, A_N$ whilst acting trivially on all remaining $A_i$'s.

The connection between the two parts of this definition is that $\nu_N$ is exchangeable
iff it is the restriction to $X^N$ of some permutation-invariant measure $\nu_\infty$ on $X^\infty$.
From Theorems 8.6 and 8.9 we obtain the Hewitt-Savage Theorem:

Corollary 8.11. Let $X$ be a compact Hausdorff space. For any $N \in \mathbb{N}$, any infinitely
exchangeable probability measure $\nu_N$ on $X^N$ takes the form

\[ \nu_N = \int_{\mathrm{Pr}(X)} dP(\mu)\, \mu^N \tag{8.79} \]

for some probability measure $P$ on $\mathrm{Pr}(X)$ that is uniquely determined by $\nu_N$, and
similarly for $N = \infty$, where $\nu_\infty$ is a permutation-invariant probability measure.

The two claims in the corollary are equivalent by the remark after Definition 8.10.

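The probability measure $P \in \mathrm{Pr}(\mathrm{Pr}(X))$ has the following interpretation. For $N \in \mathbb{N}$
and $(x_1, \ldots, x_N) \in X^N$, define the so-called empirical measure $E_N^{(x_1, \ldots, x_N)}$ on $X$ as

\[ E_N^{(x_1, \ldots, x_N)} = \frac{1}{N} \sum_{i=1}^N \delta_{x_i}, \tag{8.80} \]

where $\delta_x$ is the Dirac measure on $X$. Seen as a map on $C(X)$, this is the same as

\[ \int_X dE_N^{(x_1, \ldots, x_N)}\, f = \frac{1}{N} \sum_{i=1}^N f(x_i). \tag{8.81} \]

Given a probability measure $\nu_N$ on $X^N$, these formulae give a random probability
measure on $X$ depending on a drawing from $X^N$, i.e., a map

\[ E_N : X^N \to \mathrm{Pr}(X); \tag{8.82} \]

\[ (x_1, \ldots, x_N) \mapsto E_N^{(x_1, \ldots, x_N)}. \tag{8.83} \]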
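For a finite sample the empirical measure (8.80) is just a table of relative frequencies, and (8.81) is the corresponding weighted sum. The following minimal Python sketch illustrates this; the names `empirical_measure` and `integrate` are chosen here for illustration, and sample points are assumed to be hashable so that they can serve as dictionary keys.

```python
from collections import Counter
from fractions import Fraction


def empirical_measure(xs):
    """E_N^{(x_1,...,x_N)} = (1/N) sum_i delta_{x_i}, cf. (8.80),
    represented as a dict {point: weight} with exact rational weights."""
    n = len(xs)
    return {x: Fraction(count, n) for x, count in Counter(xs).items()}


def integrate(measure, f):
    """Integral of f against a finitely supported measure, cf. (8.81)."""
    return sum(weight * f(x) for x, weight in measure.items())


sample = ["a", "b", "a", "a"]
mu = empirical_measure(sample)
print(mu)                                   # weights 3/4 on 'a', 1/4 on 'b'
print(integrate(mu, lambda x: 1))           # total mass 1
print(integrate(empirical_measure([1, 2, 3, 4]), lambda x: x))  # sample mean
```

Integrating the constant function 1 returns the total mass, which is 1 for every sample, so each drawing from $X^N$ indeed yields a probability measure on $X$.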

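Proposition 8.12. The probability measure $P$ in Corollary 8.11 is given by

\[ \lim_{N\to\infty} \int_{\mathrm{Pr}(X)} dP_N\, F = \int_{\mathrm{Pr}(X)} dP\, F, \tag{8.84} \]

for each $F \in C(\mathrm{Pr}(X))$ (that is, $P = \lim_{N\to\infty} P_N$ weakly), where $P_N \in \mathrm{Pr}(\mathrm{Pr}(X))$ is
the probability measure on $\mathrm{Pr}(X)$ defined by $\nu_N \in \mathrm{Pr}(X^N)$ and (8.82) - (8.83), i.e.,

\[ P_N(A) = \nu_N(E_N^{-1}(A)) \quad (A \subset \mathrm{Pr}(X)). \tag{8.85} \]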

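Proof. By the Stone-Weierstrass Theorem it suffices to prove (8.84) for linear
combinations of monomials like $F(\mu) = \mu(f_1) \cdots \mu(f_K)$, where $f_1, \ldots, f_K \in C(X)$ are
arbitrary and $\mu(f) = \int_X d\mu\, f$. This is a simple computation; using (8.85), we have

\begin{align*}
\int_{\mathrm{Pr}(X)} dP_N\, F &= \int_{X^N} d\nu_N(x_1, \ldots, x_N)\, F\big( E_N^{(x_1, \ldots, x_N)} \big) \\
&= \int_{X^N} d\nu_N(x_1, \ldots, x_N) \prod_{j=1}^K \Big( \frac{1}{N} \sum_{i=1}^N f_j(x_i) \Big) \\
&= \int_{\mathrm{Pr}(X)} dP(\mu) \int_{X^N} d\mu^N(x_1, \ldots, x_N) \prod_{j=1}^K \Big( \frac{1}{N} \sum_{i=1}^N f_j(x_i) \Big),
\end{align*}

where in the third step we used (8.79). The result follows, since clearly

\begin{align*}
& \lim_{N\to\infty} \int_{\mathrm{Pr}(X)} dP(\mu) \int_{X^N} d\mu^N(x_1, \ldots, x_N) \prod_{j=1}^K \Big( \frac{1}{N} \sum_{i=1}^N f_j(x_i) \Big) \\
&\quad = \int_{\mathrm{Pr}(X)} dP(\mu) \int_X d\mu(x_1)\, f_1(x_1) \cdots \int_X d\mu(x_K)\, f_K(x_K) = \int_{\mathrm{Pr}(X)} dP\, F. \quad □
\end{align*}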
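The convergence (8.84) can be checked exactly in a small example. The sketch below is an illustration under stated assumptions, not part of the proof: we take $X = \{0,1\}$, let $P$ be supported on finitely many coin biases $p$ (the dictionary `prior`, a name chosen here), take $\nu_N$ to be the corresponding mixture (8.79) of i.i.d. coin measures, and use the monomial $F(\mu) = \mu(f)^2$ with $f(x) = x$ (so $K = 2$ and $f_1 = f_2 = f$). Since the empirical measure of a sample with $k$ ones gives $\mu(f) = k/N$, the integral against $P_N$ is a binomial sum.

```python
from fractions import Fraction
from math import comb


def second_moment_PN(prior, n):
    """Integral of F(mu) = mu({1})^2 against P_N, for X = {0, 1} and
    nu_N = sum_p prior[p] * Bernoulli(p)^N (a finite mixture as in (8.79))."""
    total = Fraction(0)
    for p, weight in prior.items():
        for k in range(n + 1):
            prob_k = comb(n, k) * p**k * (1 - p) ** (n - k)  # k ones among n
            total += weight * prob_k * Fraction(k, n) ** 2   # F(empirical measure)
    return total


def second_moment_P(prior):
    """Integral of the same F against the mixing measure P itself."""
    return sum(weight * p**2 for p, weight in prior.items())


# Hypothetical mixing measure: biases 1/4 and 3/4 with equal weight.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
for n in (10, 100, 1000):
    print(n, second_moment_PN(prior, n) - second_moment_P(prior))
```

In this example the difference equals $\sum_p P(p)\, p(1-p)/N$, which is the variance term of the binomial distribution divided by $N^2$, so it tends to zero like $1/N$.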

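We can also say more about the limit of the sum (8.81). So far, we have been
dealing with the Borel $\sigma$-algebras $\mathcal{B}_N \subset \mathcal{P}(X^N)$ and $\mathcal{B}_\infty \subset \mathcal{P}(X^\infty)$ generated by
the topology (i.e., by the open sets). On top of this, consider $\mathcal{S}_N \subset \mathcal{B}_N$, defined
as the $\sigma$-algebra generated by the permutation-invariant Borel subsets of $X^N$, or,
equivalently, as the smallest $\sigma$-algebra for which the permutation-invariant Borel
measurable functions on $X^N$ are measurable. Likewise, $\mathcal{S}_\infty \subset \mathcal{B}_\infty$; regarding $A \subset X^N$
as a subset $A \times X \times X \times \cdots$ of $X^\infty$, the $\sigma$-algebra $\mathcal{B}_\infty$ is generated by
$\bigcup_N \mathcal{B}_N$. For any permutation-invariant probability measure $\nu_N$ on $X^N$, the Hilbert
space $L^2(X^N, \mathcal{S}_N, \nu_N)$ is a closed subspace of $L^2(X^N, \mathcal{B}_N, \nu_N)$, and the associated
conditional expectation

\[ E_{(\mathcal{S}_N, \nu_N)} : L^2(X^N, \mathcal{B}_N, \nu_N) \to L^2(X^N, \mathcal{S}_N, \nu_N) \tag{8.86} \]

is defined as the corresponding orthogonal projection. Since $C(X^N) \subset L^2(X^N)$, this
map restricts to $C(X^N)$. Similarly for $N = \infty$. For each $N \in \mathbb{N}$, and also for $N = \infty$,
we may regard $f \in C(X)$ as a function $f_K$ on $X^N$ through

\[ f_K(x_1, \ldots, x_N) = f(x_K), \quad K = 1, \ldots, N. \tag{8.87} \]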

Proposition 8.13. Let $\nu_\infty$ be a permutation-invariant probability measure on $X^\infty$,
with restriction $\nu_N$ to $X^N$. Recall (8.42). For any $f \in C(X)$ we have pointwise:

\[ S_{1,N}(f) = E_{(\mathcal{S}_N, \nu_N)}(f_1), \quad \nu_N\text{-almost surely}; \tag{8.88} \]

\[ \lim_{N\to\infty} S_{1,N}(f) = E_{(\mathcal{S}_\infty, \nu_\infty)}(f_1), \quad \nu_\infty\text{-almost surely}, \tag{8.89} \]

where the left-hand sides of (8.88) and (8.89) are functions on $X^N$ and $X^\infty$,
respectively. Furthermore, if $\nu_\infty = \mu^\infty$ for some $\mu \in \mathrm{Pr}(X)$, then pointwise on $X^\infty$,

\[ \lim_{N\to\infty} S_{1,N}(f) = \int_X d\mu\, f, \quad \mu^\infty\text{-almost surely} \quad (f \in C(X)). \tag{8.90} \]

Equivalently, if $L_\mu \subset X^\infty$ is the set of infinite sequences $(x_1, x_2, \ldots)$ in $X^\infty$ for which
the limit in (8.90) exists for each $f \in C(X)$ and equals $\int_X d\mu\, f$, then

\[ \mu^\infty(L_\mu) = 1. \tag{8.91} \]

Proof. Eq. (8.88) is almost trivial, since $S_{1,N}(f)$ is permutation invariant and hence
already lies in $L^2(X^N, \mathcal{S}_N, \nu_N)$, so that the equality just expresses the projection
property of the conditional expectation. Eq. (8.89) follows from the ergodic theorem,
applied to the probability space $(X^\infty, \mathcal{B}_\infty, \nu_\infty)$, the unilateral shift

\[ T : (x_1, x_2, \ldots) \mapsto (x_2, x_3, \ldots), \]

and the random variable $f_1$ defined by $f \in C(X)$ via (8.87). Since $\nu_\infty$ is permutation
invariant, it is also $T$-invariant (in the sense that $\nu_\infty(T^{-1}(A)) = \nu_\infty(A)$ for any
$A \in \mathcal{B}_\infty$). This follows either directly, where one has to realize firstly that

\[ T^{-1}(A_1 \times A_2 \times \cdots \times A_n \times \cdots) = X \times A_1 \times A_2 \times \cdots \times A_n \times \cdots, \]

and secondly that $\mathcal{B}_\infty$ is generated by products with finitely many $A_i$ different
from $X$, or, more easily, from Corollary 8.11. The (pointwise) ergodic theorem gives

lim Si^Nif) = E(ag yAfi), Voo-almost surely (/ G C(X)), (8.92) 

where is the CJ-algebra within by the T -invariant sets, and /i G C{X°°) is still 
defined by (8.87). Since C the left-hand side of (8.89) is oS^oo-measurable 

(provided it exists, as we have just shown), eq. (8.89) follows from (8.92). 

If $\nu_\infty = \mu^\infty$, then the unilateral shift on $X^\infty$ is ergodic by Kolmogorov's 0-1 law,
and hence the ergodic theorem gives (8.90). Alternatively, if $\nu_\infty = \mu^\infty$, then the
random variables $(f_K)$, defined by (8.87) with $N = \infty$, are i.i.d. (i.e., independent
and identically distributed) and (8.90) follows from the strong law of large numbers
(which, coherently, in turn may be derived from the ergodic theorem!). □

Note that (8.92) has been proved for $f \in C(X)$, but it holds for many other
functions, including $f = 1_A$, where $A \subset X$ is a Borel set. This gives Borel's law of large numbers

\[ \lim_{N\to\infty} S_{1,N}(1_A) = \mu(A), \quad \mu^\infty\text{-almost surely}. \tag{8.93} \]

For example, take $X = \{0,1\}$ (e.g., a coin toss with outcomes 1 = heads and 0 =
tails). With $f(x) = x$ in (8.90) or $A = \{1\}$ in (8.93), writing $p = \mu(\{1\})$, we obtain

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^N x_i = p, \quad \mu^\infty\text{-almost surely on } 2^{\mathbb{N}}. \tag{8.94} \]

Equivalently, if $L_p \subset 2^{\mathbb{N}}$ is the set of infinite binary sequences $x_1 x_2 \cdots$ for which the
limit in (8.94) exists and equals $p$, then $\mu^\infty(L_p) = 1$, cf. (8.91).
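The coin-toss statement (8.94) is easy to illustrate by simulation. The sketch below is only a finite-sample illustration and cannot, of course, establish an almost-sure limit; the function name `running_frequency`, the bias 0.3, and the fixed seed are choices made here so that the run is reproducible.

```python
import random


def running_frequency(p, n, seed=0):
    """Relative frequency of heads (outcome 1) in n independent tosses of a
    coin with bias p = mu({1}), i.e. (1/N) sum_i x_i in (8.94)."""
    rng = random.Random(seed)  # private, seeded generator for reproducibility
    heads = 0
    for _ in range(n):
        heads += rng.random() < p  # True counts as 1
    return heads / n


for n in (100, 10_000, 1_000_000):
    print(n, running_frequency(0.3, n))
```

For $N$ tosses the standard deviation of the relative frequency is $\sqrt{p(1-p)/N}$, so typical deviations from $p$ shrink like $1/\sqrt{N}$; the printed values should approach 0.3 accordingly.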
8.4 Frequency interpretation of probability and Born rule

Results like (8.90), (8.93), and (8.94) give a relationship between the single-case
probabilities $\mu(A)$ or $p$ and the limits of long series of trials on samples drawn
according to $\mu$ or $p$. Despite the seemingly comforting appearance of $N < \infty$ on the
left-hand side, this relationship depends in an essential way on the infinite
idealization $X^\infty$, which is strictly necessary in order to be able to say that the limit (8.94)
holds almost surely relative to the measure $\mu^\infty$. This violates Earman's Principle (cf.
the Introduction), which is the reason why we prefer the limit (8.49) over (8.93).

Although these results are mathematically equivalent, both formalizing the idea that if $(x_1,\ldots,x_N)$ are sampled from $X$ according to some probability measure $\mu$, then $(1/N)\sum_{i=1}^{N} f(x_i)$ converges to $\int_X d\mu\, f$ as $N \to \infty$, in (8.49) we never need to work with the “actual infinity” $N = \infty$, and (8.49) holds everywhere on $\Pr(X)$ rather than almost everywhere on $X^\infty$. One reason for this is that in (8.93) etc. the choice of the sampling measure $\mu$ has to be made at the beginning, whereas in (8.49) it only comes in at the very end. But it has to be made either way, and similarly for any other serious effort to relate probability to frequencies in long runs of measurements.

The extreme delicacy of such efforts is clear from the fact that limiting results 
like (8.90), (8.93), and (8.94) are insensitive to any finite part of the sum, whereas 
any practical use of probability only involves finite trials. As Lord Keynes once said:

‘In the long run we are all dead.’ 

The founder of the mathematical theory of probability expressed himself likewise: 

‘The frequency concept based on the notion of limiting frequency as the number of trials 
increased to infinity, does not contribute anything to substantiate the applicability of the 
results of probability theory to real practical problems where we have always to deal with a 
finite number of trials.’ (Kolmogorov). 

Moreover, a definition of probability based on e.g. (8.93) is well known to be circular: although superficially the “almost sure” terminology in the statement of the result might instill confidence in the reader, in fact it is an exceptionally strong constraint on the sequences $(x_n) \in X^\infty$ in question that the limit should exist and have the right value $\mu(A)$, i.e., that $(x_n) \in L_\mu$, cf. (8.91), and we see that this constraint can only be formulated if the single-case probability $\mu$ was already defined in the first place. This shows that the link between probability and frequencies of outcomes of long runs of trials only exists and makes sense if single-case probabilities are prior.

On the other hand, if single-case probabilities are “objective”, as those provided 
by the Born measure in quantum mechanics ought to be at least in remotely realistic 
interpretations of the theory (as opposed to “personal” or “subjective” probabili¬ 
ties construed as “degrees of belief” or “rationality constraints” or whatever other 
decision-theoretic concept in human psychology), then it is hard to say what they 
really mean, since it is precisely about single cases that they do not seem to say 
anything. This brings us to what we propose to call the Paradox of Probability: 
Although single-case probabilities must be logically prior to probabilities construed 
as frequencies, the numerical values of the former have no bearing on single trials 
and can only be validated through their predictions about (finite) frequencies. 
This paradox imposes the following consistency requirement (which philosophers may want to compare with Lewis’s “Principal Principle” that regulates credences):

The assumption that a single-case probability measure be $\mu$ must imply that the probabilities for the various outcomes of long runs of repetitions of identical experiments (provided these are possible) are distributed according to $\mu$.

This describes the relationship between theoretical and experimental physics quite 
well, but still leaves us in the dark as to the meaning of single-case probabilities! 
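
To make the consistency requirement concrete for the coin-toss case, the following sketch computes, for a finite number $n$ of independent trials, the exact distribution of the number of ones implied by a single-case probability $p$, and the probability that the observed frequency deviates from $p$ by more than a tolerance. The names `count_distribution`, `deviation_probability` and the tolerance `eps` are assumptions introduced for this illustration; the computation is the standard binomial distribution and uses no infinite idealization.

```python
import math


def count_distribution(p, n):
    """Exact probabilities P(k ones in n trials), k = 0..n, for i.i.d. trials, 0 < p < 1."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    probs = []
    for k in range(n + 1):
        # log of the binomial coefficient, computed via log-gamma to avoid overflow
        log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        probs.append(math.exp(log_binom + k * log_p + (n - k) * log_q))
    return probs


def deviation_probability(p, n, eps):
    """P(|k/n - p| > eps) for the relative frequency k/n of ones in n trials."""
    return sum(w for k, w in enumerate(count_distribution(p, n)) if abs(k / n - p) > eps)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, deviation_probability(0.5, n, 0.1))
```

The deviation probability is small for large $n$ but never zero, so a finite run can support or undermine an assumed single-case value without ever fixing it.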

We are now ready to revisit the Born rule, which we already discussed from a purely mathematical point of view in §§2.1, 2.5, and 4.1. To repeat the main point, if $a = a^* \in B(H)$ is a bounded self-adjoint operator on a Hilbert space, with spectrum $\sigma(a)$, then any state $\omega$ on $B(H)$ defines a unique probability measure $\mu_\omega$ on $\sigma(a) \subset \mathbb{R}$, called the Born measure, such that

$$\omega(f(a)) = \int_{\sigma(a)} d\mu_\omega\, f, \quad f \in C(\sigma(a)), \tag{8.95}$$

where $f(a) \in C^*(a) \subset B(H)$ is defined through the continuous functional calculus (Theorem 4.3). For example, for $f = \mathrm{id}_{\sigma(a)}$, i.e., the function $x \mapsto x$, eq. (8.95) yields

$$\omega(a) = \int_{\sigma(a)} d\mu_\omega(x)\, x. \tag{8.96}$$

The point of this construction of the Born measure is that it is obtained by simply restricting the state $\omega$, initially defined on $B(H)$, to its commutative C*-subalgebra $C^*(a)$. If, in the spirit of (exact) Bohrification, such commutative algebras are identified with corners of classical physics within quantum theory, one may argue that Heisenberg gave the right picture of the origin of probability in quantum mechanics:

‘One may call these uncertainties objective, in that they are simply a consequence of the 
fact that we describe the experiment in terms of classical physics; they do not depend in 
detail on the observer. One may call them subjective, in that they reflect our incomplete 
knowledge of the world.’ (Heisenberg, 1958, pp. 53-54) 

See, however, §11.1. In any case, there are extensions of this construction to un¬ 
bounded self-adjoint operators as well as to families of commuting self-adjoint op¬ 
erators, to which the following discussion applies, too, mutatis mutandis. 
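
For a finite-dimensional illustration of (8.95), the sketch below treats the simplest case of an observable that is already diagonal, $a = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$, in a vector state given by a unit vector $v$. The function names `born_measure`, `state_of_function` and `integral` are assumptions introduced for this example; restricting to a diagonal observable is a simplification that avoids a numerical eigendecomposition.

```python
from collections import defaultdict


def born_measure(eigenvalues, v):
    """Born measure of a = diag(eigenvalues) in the vector state of the unit vector v.

    Returns {eigenvalue: probability}. Repeated eigenvalues are pooled, which
    amounts to using the spectral projection onto the whole eigenspace.
    """
    mu = defaultdict(float)
    for lam, amp in zip(eigenvalues, v):
        mu[lam] += abs(amp) ** 2
    return dict(mu)


def state_of_function(eigenvalues, v, f):
    """omega(f(a)) = <v, f(a) v>, using f(a) = diag(f(eigenvalues))."""
    return sum(abs(amp) ** 2 * f(lam) for lam, amp in zip(eigenvalues, v))


def integral(mu, f):
    """Integral of f against the finitely supported measure mu."""
    return sum(weight * f(lam) for lam, weight in mu.items())


if __name__ == "__main__":
    eig, v = [1.0, 0.0, 1.0], [0.6, 0.8j, 0.0]
    mu = born_measure(eig, v)
    f = lambda x: 3 * x + 2
    # Both sides of (8.95) for this choice of a, v and f.
    print(state_of_function(eig, v, f), integral(mu, f))
```

The two printed numbers agree up to rounding, which is (8.95) in this special case.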

The Born rule relates the Born measure for a to measurements of a and as such 
is responsible for most predictions of quantum physics, especially in quantum field 
theory, where the connection between theory and experiment mainly involves the 
measurement of cross-sections computed from the Born measure via Feynman rules. 
The Born rule and the Heisenberg uncertainty relations are often seen as a turning point where indeterminism entered fundamental physics. Nonetheless, it is hard to say what this Born rule actually states! We made a first attempt in §4.1:

If an observable $a$ is measured in a state $\omega$, then the probability $P_\omega(a \in A)$ that the outcome lies in some measurable subset $A \subseteq \sigma(a) \subset \mathbb{R}$ is given by

$$P_\omega(a \in A) = \mu_\omega(A). \tag{8.97}$$
Two questions immediately arise:

1. What is meant by a “measurement” of $a$ (and by its “outcome”)?

2. What does the “probability” $P_\omega(a \in A)$ mean?

Perhaps these are even the main questions in the foundations of quantum mechanics. The first will be taken up in Chapter 11; for now, we simply assume that measurements of quantum-mechanical observables $a$ are defined and have outcomes in $\sigma(a)$. The second has just been answered (or some might say evaded): through the Born measure, the formalism of quantum mechanics provides numerical values of $P_\omega(a \in A)$ whose mathematical meaning seems unquestionable, and whose operational meaning is given by the predictions they give for outcomes of long runs of repetitions of identical experiments. Therefore, all that remains to be done is derive these predictions by analogy with the results in §8.3 for the commutative C*-algebra $C(X)$.

One such attempt is, in its strengths and its weaknesses, quite analogous to Borel’s law of large numbers (8.93). Although we will soon move to $B = B(H)$, the following result is valid for any unital C*-algebra $B$, with infinite tensor product $B^\infty$ as defined in §C.14 and recalled at the end of §8.2, including the map $\varphi_M : B^M \to B^\infty$.

Proposition 8.14. If $\omega \in S(B)$, there is a unique state $\omega^\infty$ on $B^\infty$ such that

$$\omega^\infty(\varphi_M(b_1 \otimes \cdots \otimes b_M)) = \prod_{n=1}^{M} \omega(b_n), \quad M \in \mathbb{N},\ b_1,\ldots,b_M \in B. \tag{8.98}$$

Moreover, $\omega^\infty$ is pure iff $\omega$ is pure.

This is a special case of Proposition C.105, with $C_i = B$ and $\omega_i = \omega$ for all $i \in \mathbb{N}$.
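
The factorization (8.98) can be checked by hand for $M = 2$ and a vector state on $B = M_2(\mathbb{C})$. The following sketch does so with plain nested lists; the helper names `kron`, `kron_vec` and `expectation`, and the particular matrices, are assumptions chosen for the example.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def kron_vec(u, w):
    """Tensor product of two vectors given as lists."""
    return [x * y for x in u for y in w]


def expectation(v, m):
    """<v, m v> for a vector v and a square matrix m."""
    n = len(v)
    return sum(v[i].conjugate() * m[i][j] * v[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    v = [0.6, 0.8j]                      # unit vector in C^2
    b1 = [[1, 2], [2, -1]]
    b2 = [[0, -1j], [1j, 3]]
    lhs = expectation(kron_vec(v, v), kron(b1, b2))   # state on b1 (x) b2
    rhs = expectation(v, b1) * expectation(v, b2)     # product of one-site values
    print(lhs, rhs)
```

Both numbers coincide up to rounding, as (8.98) requires for the product of two vector states.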

We now take $B = B(H)$ for some separable Hilbert space $H$, some observable $a = a^* \in B(H)$ with spectrum $\sigma(a) \subset \mathbb{R}$, and some unit vector $v \in H$, with associated (normal) pure state $\omega_v$ on $B(H)$ defined by $\omega_v(b) = \langle v, bv \rangle$, and Born measure $\mu_{\omega_v} \equiv \mu_v$ on $\sigma(a)$. Now take the corresponding pure state $\omega_v^\infty$ on $B(H)^\infty$ and construct the associated GNS-representation $\pi_{\omega_v^\infty}(B(H)^\infty)$. The Hilbert space $H_{\omega_v^\infty}$ carrying this representation is an example of an infinite tensor product of Hilbert spaces in the sense of von Neumann, which may also be defined directly, as follows. Take sequences $(\psi_n) = (\psi_1, \psi_2, \ldots)$ with $\psi_n \in H$ satisfying the condition

$$\sum_n \big|\, \|\psi_n\| - 1 \,\big| < \infty; \tag{8.99}$$

the rationale behind this condition is that for any sequence $(z_n)$ of complex numbers, the product $\prod_n z_n$ converges and has a nonzero limit iff $\sum_n |z_n - 1| < \infty$, so (8.99) is equivalent to the requirement that $\prod_n \|\psi_n\|$ converges to some nonzero value. Following von Neumann, we now introduce the convention that if, for some sequence $(z_n)$ of complex numbers, $\prod_n |z_n|$ converges but $\prod_n z_n$ does not, we define the latter to be zero. On this convention, linear and continuous extension of the expression

$$\langle (\psi_n), (\psi_n') \rangle = \prod_n \langle \psi_n, \psi_n' \rangle_H \tag{8.100}$$
defines an inner product on the finite linear span $H_0$ of all sequences $(\psi_n)$ satisfying (8.99); the complete tensor product $H^\infty$ is defined as the closure of $H_0$ in the ensuing norm. However, this is not the Hilbert space of interest, since it is far too large (e.g., it is not separable even if $H$ is). To define interesting separable subspaces of $H^\infty$, we call sequences $(\psi_n)$ and $(\psi_n')$ that both satisfy (8.99) equivalent if

$$\sum_n | \langle \psi_n, \psi_n' \rangle - 1 | < \infty; \tag{8.101}$$

this turns out to be a bona fide equivalence relation. In particular, if $(\psi_n)$ and $(\psi_n')$ are inequivalent, then $\langle (\psi_n), (\psi_n') \rangle = 0$. For any unit vector $v \in H$, we now define the incomplete tensor product $H_v^\infty$ as the closure of the linear span of all sequences $(\psi_n)$ that satisfy (8.99) and are equivalent to $v^\infty$ (i.e., the sequence $(\psi_n')$ with $\psi_n' = v$ for each $n$), with inner product borrowed from $H^\infty$ (note that von Neumann’s terminology “incomplete” is somewhat confusing, since $H_v^\infty$ is complete as a normed vector space and in particular it is a Hilbert space). By construction, $v^\infty \in H_v^\infty$, and it is easy to show that $H_v^\infty$ is the closed linear span of all sequences $(\psi_n)$ that differ from $v^\infty$ in at most finitely many places. We often write $\otimes_n \psi_n$ or $\psi_1 \otimes \psi_2 \otimes \cdots$ for $(\psi_n)$. Furthermore, for any $M \in \mathbb{N}$, any $b \in B(H)$ defines a bounded operator $b^{(M)}$ on $H_v^\infty$ by continuous linear extension of

$$b^{(M)}(\psi_1 \otimes \cdots \otimes \psi_M \otimes \cdots) = \psi_1 \otimes \cdots \otimes b\psi_M \otimes \cdots. \tag{8.102}$$

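This extends to a representation $\pi_v$ of $B^\infty$ on $H_v^\infty$, as follows. Define $b^{(M)} \in B^\infty$ by

$$b^{(M)} = \varphi_M(1_H \otimes \cdots \otimes 1_H \otimes b), \tag{8.103}$$

in which $1_H \otimes \cdots \otimes 1_H \otimes b \in B^M$, and $\varphi_M : B^M \to B^\infty$ was defined after (8.58). In other words, for $b \in B(H)$, the element $b^{(M)}$ of $B^\infty$ is the equivalence class of the sequence that has $1_H$ in every place except the $M$-th, where it equals $b$. We then define $\pi_v(B^\infty)$ by linear and continuous extension of

$$\pi_v(b_1^{(1)} \cdots b_M^{(M)}) = b_1^{(1)} \cdots b_M^{(M)}, \tag{8.104}$$

where the right-hand side is the product of the operators defined in (8.102).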

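Proposition 8.15. For any unit vector $v \in H$, the GNS-representation $\pi_{\omega_v^\infty}(B^\infty)$ on $H_{\omega_v^\infty}$ is unitarily equivalent with $\pi_v(B^\infty)$ on $H_v^\infty$, under which equivalence the cyclic vector $\Omega_{\omega_v^\infty} \in H_{\omega_v^\infty}$ corresponds with $v^\infty \in H_v^\infty$.

Proof. This is a simple consequence of Proposition C.91 and the equality

$$\omega_v^\infty(a) = \langle v^\infty, \pi_v(a) v^\infty \rangle_{H_v^\infty}, \tag{8.105}$$

initially for $a = b^{(M)}$, subsequently for $a = b_1^{(1)} \cdots b_M^{(M)}$, and finally, by linearity and continuity, for any $a \in B^\infty$. □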

In view of this, we will henceforth identify the two Hilbert spaces etc., so that:

$$H_{\omega_v^\infty} = H_v^\infty; \tag{8.106}$$

$$\pi_{\omega_v^\infty} = \pi_v; \tag{8.107}$$

$$\Omega_{\omega_v^\infty} = v^\infty. \tag{8.108}$$

Recall that $\mathcal{P}(H)$ is the set of all projections on $H$, seen as a lattice ordered by $e \leq f$ iff $ef = e$, which is equivalent to $eH \subseteq fH$, and coincides with the order in $B(H)_{sa}$, cf. Proposition C.170. Also, $\mathcal{B}(\sigma(a))$ is the Boolean lattice of Borel subsets of $\sigma(a)$, ordered by inclusion. For each Borel set $A \subseteq \sigma(a)$ we have an associated spectral projection $e_A \in \mathcal{P}(H)$, and the map $A \mapsto e_A$ defined by the Borel functional calculus, i.e., Theorem B.102, is a lattice homomorphism from $\mathcal{B}(\sigma(a))$ to $\mathcal{P}(H)$. This follows because from the perspective of the Borel functional calculus the map $A \mapsto e_A$ is really the map $1_A \mapsto e_A$, which is the restriction of a homomorphism between C*-algebras and hence preserves positivity. Let $\mathcal{B}(\sigma(a)^\infty)$ be the Boolean lattice of Borel sets in $\sigma(a)^\infty$. As above, take some unit vector $v \in H$, with corresponding vector state $\omega_v$ on $B(H)$ and associated state $\omega_v^\infty$ on $B(H)^\infty$ as defined in Proposition 8.14, which in turn defines the GNS-representation $\pi_{\omega_v^\infty}$ of $B(H)^\infty$ on the Hilbert space $H_{\omega_v^\infty}$. The lattice homomorphism $A \mapsto e_A$ then extends to a homomorphism

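$$e^\infty : \mathcal{B}(\sigma(a)^\infty) \to \mathcal{P}(H_{\omega_v^\infty}); \tag{8.109}$$

$$A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a) \mapsto \pi_{\omega_v^\infty}\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big); \tag{8.110}$$

this defines $e^\infty$ on the basic Borel sets in $\sigma(a)^\infty$ and extends to all of $\mathcal{B}(\sigma(a)^\infty)$. Realizing $H_{\omega_v^\infty}$ as the infinite tensor product $H_v^\infty$, cf. (8.106) - (8.108), we rewrite this as

$$e^\infty\Big(A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a)\Big) = e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}. \tag{8.111}$$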

Theorem 8.16. Let $a = a^* \in B(H)$, let $\mu_v$ be the Born measure on $\sigma(a)$ defined by some unit vector $v \in H$, and define $e^\infty$ by (8.111). Let $\sigma(a)^\infty_v$ be the set of all points in $\sigma(a)^\infty$ for which (8.92), or, equivalently, (8.93) holds (with $\mu \leadsto \mu_v$). Then

$$e^\infty(\sigma(a)^\infty_v) = 1_{H_v^\infty}. \tag{8.112}$$

Furthermore, if $A \subseteq \sigma(a)$ is Borel measurable, then, using the notation (8.39),

$$\lim_{N\to\infty} S_{1,N}(e_A) = \mu_v(A) \cdot 1_{H_v^\infty}, \tag{8.113}$$

in the strong operator topology (i.e., applied to each fixed vector in $H_{\omega_v^\infty}$).

This is the quantum-mechanical strong law of large numbers, plus its Borel version. In comparison, the strong law of large numbers or Borel’s law of large numbers gives

$$\mu_v^\infty(\sigma(a)^\infty_v) = 1. \tag{8.114}$$
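Proof. For any probability measure $\mu$ on any σ-finite compact space $X$ (here with $X = \sigma(a)$), the corresponding probability measure $\mu^\infty$ on $X^\infty$ is characterized by the property

$$\mu^\infty\Big(A_1 \times \cdots \times A_M \times A \times \prod_{M+2}^{\infty} \sigma(a)\Big) = \mu(A)\, \mu^\infty\Big(A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a)\Big)$$

for any $M \in \mathbb{N}$ and Borel sets $A_i, A \subseteq X$. The measure $\nu$ on $\sigma(a)^\infty$ defined by

$$\nu\Big(A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a)\Big) = \omega_v^\infty\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) \tag{8.115}$$

satisfies the above property for $\mu = \mu_v$ and hence coincides with $\mu_v^\infty$. In view of this, eqs. (C.196) and (8.114) give

$$\langle \Omega_{\omega_v^\infty}, e^\infty(\sigma(a)^\infty_v)\, \Omega_{\omega_v^\infty} \rangle = 1. \tag{8.116}$$

For any projection $e'$ and any unit vector $\psi' \in H'$ in any Hilbert space $H'$, the properties $\langle \psi', e'\psi' \rangle = 1$, $\|e'\psi'\| = 1$, and $e'\psi' = \psi'$ are equivalent. Therefore,

$$e^\infty(\sigma(a)^\infty_v)\, \Omega_{\omega_v^\infty} = \Omega_{\omega_v^\infty}. \tag{8.117}$$

Consider a vector $\otimes_n \psi_n \in H_v^\infty$ where only $\psi_1, \ldots, \psi_K$ possibly differ from $v$ ($K < \infty$). Noting that by (8.106) - (8.107) the right-hand side of (8.115) may be written as

$$\omega_v^\infty\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) = \langle \Omega_{\omega_v^\infty}, \pi_{\omega_v^\infty}\big(e_{A_1}^{(1)} \cdots e_{A_M}^{(M)}\big) \Omega_{\omega_v^\infty} \rangle = \langle v^\infty, e_{A_1}^{(1)} \cdots e_{A_M}^{(M)} v^\infty \rangle, \tag{8.118}$$

we modify (8.115) so as to define a new measure $\nu'$ on $\sigma(a)^\infty$ by

$$\nu'\Big(A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a)\Big) = \langle \otimes_n \psi_n, e_{A_1}^{(1)} \cdots e_{A_M}^{(M)} \otimes_n \psi_n \rangle.$$

Generalizing the above case of $\mu^\infty$, the measure $\nu'' = \mu_{\psi_1} \times \cdots \times \mu_{\psi_K} \times \prod_{K+1}^{\infty} \mu_v$ on $\sigma(a)^\infty$ is characterized by the following two properties:

$$\nu''\Big(A_1 \times \cdots \times A_K \times \prod_{K+1}^{\infty} \sigma(a)\Big) = \mu_{\psi_1}(A_1) \cdots \mu_{\psi_K}(A_K); \tag{8.119}$$

$$\nu''\Big(A_1 \times \cdots \times A_M \times A \times \prod_{M+2}^{\infty} \sigma(a)\Big) = \mu_v(A)\, \nu''\Big(A_1 \times \cdots \times A_M \times \prod_{M+1}^{\infty} \sigma(a)\Big) \quad (M \geq K), \tag{8.120}$$

and hence $\nu' = \nu''$. Therefore, even though $\nu' \neq \mu_v^\infty$, we have $\nu'(\sigma(a)^\infty_v) = 1$, since membership of $\sigma(a)^\infty_v$ is entirely defined by the tail of the event. Hence we obtain

$$e^\infty(\sigma(a)^\infty_v) \otimes_n \psi_n = \otimes_n \psi_n \tag{8.121}$$

by the same reasoning as for $v^\infty = \Omega_{\omega_v^\infty}$. Since the linear span of such vectors is dense in $H_v^\infty = H_{\omega_v^\infty}$ and the projection $e^\infty(\sigma(a)^\infty_v)$ is bounded, we obtain (8.112).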

To derive (8.113), we use the definition of the Born measure $\mu_v$ to find

$$\| (S_{1,N}(e_A) - \mu_v(A)) v^\infty \|^2 = \frac{1}{N} \big( \mu_v(A) - \mu_v(A)^2 \big), \tag{8.122}$$

which vanishes as $N \to \infty$, so that (8.113) holds on $v^\infty$. A similar computation proves (8.113) on vectors $\otimes_n \psi_n$ as above, since the initial $K$ terms where possibly $\psi_n \neq v$ drop out in the limit $N \to \infty$. Thus we have (8.113) on a dense subspace of $H_{\omega_v^\infty}$. Since the strong limit operator $\mu_v(A) \cdot 1_{H_{\omega_v^\infty}}$ is bounded, this proves (8.113). □

An alternative argument shows the mere existence of the limit on the left-hand side of (8.113) on the same dense set, upon which the limit operator is seen to commute with all local and hence (by norm-continuity) with all quasi-local operators. Since $\omega_v$ is pure, so is $\omega_v^\infty$, and hence $\pi_{\omega_v^\infty}$ is irreducible. Thus the limit is a multiple of the unit, and the coefficient $\mu_v(A)$ then follows from the computation

$$\lim_{N\to\infty} \langle v^\infty, S_{1,N}(e_A) v^\infty \rangle = \mu_v(A). \tag{8.123}$$

To reduce the level of abstraction and since it is an important case, we now specialize Theorem 8.16 to a two-level system, i.e., $B = M_2(\mathbb{C})$. In other words, we take $H = \mathbb{C}^2$, and pick a simple observable $a = \mathrm{diag}(0,1)$ with non-degenerate spectrum $\sigma(a) = \underline{2} = \{0,1\}$, so that measurement outcomes are just strings of zeros and ones. Furthermore, we take a unit vector $v = c_0|0\rangle + c_1|1\rangle$, where $|0\rangle = (1,0)$ and $|1\rangle = (0,1)$ form the standard basis of $\mathbb{C}^2$, and $|c_0|^2 + |c_1|^2 = 1$. We write $p = |c_1|^2$. The Born measure $\mu_v$ on $\sigma(a) = \{0,1\}$ is then given by $\mu_v(\{1\}) = p$ and $\mu_v(\{0\}) = 1 - p$; cf. (2.10) - (2.11). Taking $A = \{1\}$, we have $e_A = |1\rangle\langle 1|$. The Hilbert space $(\mathbb{C}^2)^\infty_v$ is the closure of the finite linear span of vectors of the kind $\psi_1 \otimes \psi_2 \otimes \cdots$ with $\psi_n \in \mathbb{C}^2$ and only finitely many $\psi_n$ possibly different from $v$. For $M \in \mathbb{N}$, the operator $|1\rangle\langle 1|^{(M)}$ sends such a vector to $\psi_1 \otimes \psi_2 \otimes \cdots \otimes (|1\rangle\langle 1|\psi_M) \otimes \cdots$, with all $\psi_n$ unaffected except for $n = M$. Eqs. (8.112) - (8.113) then simply read

$$e^\infty(\underline{2}^\infty_p) = 1_{(\mathbb{C}^2)^\infty_v}; \tag{8.124}$$

$$\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} |1\rangle\langle 1|^{(n)} = p \cdot 1_{(\mathbb{C}^2)^\infty_v}, \tag{8.125}$$

where $\underline{2}^\infty_p$ denotes the set of all infinite binary strings $x_1x_2\cdots$ for which $x_i \in \underline{2}$ and

$$\lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} x_i = p, \tag{8.126}$$

and once again the limit in (8.125) is meant strongly, i.e., the expression on the left-hand side must be applied to a fixed vector in $(\mathbb{C}^2)^\infty_v$.
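
The finite-$N$ identity (8.122) behind the qubit limit (8.125) can be verified directly, since the frequency operator $S_{1,N}(|1\rangle\langle 1|)$ is diagonal in the computational basis and only acts on the first $N$ tensor factors of $v^\infty$ (the remaining factors contribute a factor 1 to the norm). The sketch below assumes real amplitudes $c_0 = \sqrt{1-p}$, $c_1 = \sqrt{p}$, and the function name `frequency_deviation_sq` is introduced only for this illustration.

```python
import itertools


def frequency_deviation_sq(p, n):
    """|| (S_{1,n}(e_A) - p) v^{(n)} ||^2 for v = sqrt(1-p)|0> + sqrt(p)|1>, A = {1}.

    On a basis vector |x_1 ... x_n> the frequency operator multiplies by
    (x_1 + ... + x_n)/n, so the squared norm is a weighted sum over all
    2^n binary strings.
    """
    total = 0.0
    for bits in itertools.product((0, 1), repeat=n):
        k = sum(bits)
        weight = p ** k * (1.0 - p) ** (n - k)  # |<x_1...x_n | v^{(n)}>|^2
        total += weight * (k / n - p) ** 2
    return total


if __name__ == "__main__":
    p = 0.3
    for n in (1, 4, 8, 12):
        # Compare with the right-hand side of (8.122): (p - p^2)/n.
        print(n, frequency_deviation_sq(p, n), (p - p * p) / n)
```

The agreement holds for every finite $n$; what (8.125) adds is only the statement about the limit.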
Theorem 8.16 forms the (mathematical) culmination of attempts that started in the 1960s to derive the Born rule from other postulates of quantum mechanics, notably the so-called eigenvalue-eigenvector link, according to which a quantum-mechanical observable has a definite value if and only if the current quantum state is an eigenvector of the associated operator. This link is applied to the state $v^\infty$ (or to any other state with approximately the same tail) and the operators $e^\infty(\sigma(a)^\infty_v)$ and $\lim_{N\to\infty} S_{1,N}(e_A)$. The idea, then, is that according to (8.112), the property expressed by the projection $e^\infty(\sigma(a)^\infty_v)$ is certain in the state $v^\infty$ (for qubits this means that any possible infinite string of binary measurement outcomes has average value $p$). This is reinforced by (8.113), which states that the frequency operator for the outcome $A$ has a sharp limit equal to $\mu_v(A)$ (for qubits, with $A = \{1\}$ this limit is $p$).

However, although the mathematics is suggestive, apart from the fact that the eigenvalue-eigenvector link itself falls prey to Earman’s Principle (in that sharp eigenvalues and eigenvectors are an idealization in a world full of continuous spectra), this particular application of the link makes sense only at $N = \infty$. In this respect, eq. (8.124) has the same drawback as the strong law of large numbers (on which its derivation indeed relies), including the fact that attempts to define probabilities through (8.113) or its special case (8.125) are inherently circular. Moreover, $v^\infty$ fails to be an eigenvector of any finite-$N$ approximant to (8.125), and by the same token, the limit operator defined by (8.125) can only be measured via its individual contributions $|1\rangle\langle 1|^{(n)}$, none of which has $v^\infty$ as an eigenvector; in fact, it can be shown that any joint eigenvector of all projections $|1\rangle\langle 1|^{(n)}$ is orthogonal to the entire space $(\mathbb{C}^2)^\infty_v$ within the complete infinite tensor product $(\mathbb{C}^2)^\infty$.

Problems with Earman’s Principle are avoided if we use Theorem 8.4 (applied to $B = B(H)$) rather than Theorem 8.16: the sequence of operators $S_{1,N}(e_A)$ forms a continuous section of the continuous bundle of C*-algebras with fibers (8.50) - (8.51), whose limit at $N = \infty$, in the sense of (8.46) or (C.560), is given by

$$S_{1,\infty}(e_A) : \omega \mapsto \omega(e_A); \tag{8.127}$$

recall that $S_{1,\infty}(e_A) \in C(S(B(H)))$. In particular, for pure states $\omega = \omega_v$ we obtain the Born probability $\mu_v(A)$. As we have also seen in the commutative case, this limit avoids infinite idealizations and other problems with the law of large numbers.

From the point of view of (asymptotic) Bohrification, $C(S(B(H)))$ provides a classical description of a long run of identical experiments, which becomes increasingly accurate as $N \to \infty$; this is the whole point of the limits (8.46) and (C.560). In particular, the unsound eigenvalue-eigenvector link has been replaced by the role of points $\omega \in S(B(H))$ as truthmakers, which is uncontroversial in classical physics. If the quantum state in each identical experiment on the given (single) system is $\omega$, then the above derivation shows that in the limit $N \to \infty$ this state acquires a classical meaning (which according to Bohr would even be the only meaning it has), namely as the point in the “classical phase space” $S(B(H))$ that gives the relative frequencies of outcomes of the given long runs of identical experiments. Short of deriving the Born rule, this at least provides the reasoning that links the Born measure (which is canonically given by the theory) to experiment.
8.5 Quantum spin systems: Quasi-local C*-algebras

Beside the Born rule, our second application of the previous formalism is to quantum spin systems, especially to spontaneous symmetry breaking (SSB), see Chapter 10. Postponing a conceptual discussion of infinite systems in their role of idealizations of finite systems to the preamble of that chapter, for the moment we just describe infinite quantum spin systems mathematically. As in §C.14, we take a Hilbert space $H$, here assumed finite-dimensional, i.e., $H = \mathbb{C}^n$, and use the standard lattice $\mathbb{Z}^d$ in dimension $d$. For any finite subset $\Lambda \subset \mathbb{Z}^d$, i.e., $\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)$, we put

$$H_\Lambda = \otimes_{x \in \Lambda} H_x; \tag{8.128}$$

$$A_\Lambda = B(H_\Lambda), \tag{8.129}$$

where $H_x = H$ for each $x \in \Lambda$, cf. (C.297) and (C.303). The symbolic notations

$$A = \otimes_{x \in \mathbb{Z}^d} B(H) = \lim_{\Lambda} A_\Lambda = \overline{\bigcup_{\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)} A_\Lambda}^{\|\cdot\|} \tag{8.130}$$

all come down to the same thing (see §C.14, notably (C.323) and (C.317)) and define a quasi-local C*-algebra. Elements of each $A_\Lambda \subset A$ are called local observables, those in the closure of their union are referred to as quasi-local observables. Eq. (8.129) defines a map $\Lambda \mapsto A_\Lambda$, which has three important properties:


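$$A_{\Lambda^{(1)}} \subseteq A_{\Lambda^{(2)}} \text{ if } \Lambda^{(1)} \subseteq \Lambda^{(2)} \quad \text{(Isotony)}; \tag{8.131}$$

$$[A_{\Lambda^{(1)}}, A_{\Lambda^{(2)}}] = 0 \text{ if } \Lambda^{(1)} \cap \Lambda^{(2)} = \emptyset \quad \text{(Einstein locality)}; \tag{8.132}$$

$$A_\Lambda' = A_{\Lambda'} \quad \text{(Haag duality)}, \tag{8.133}$$

where $A_\Lambda'$ in (8.133) is the commutant of $A_\Lambda$ within $A$, and, in cute notation, we put $\Lambda' = \mathbb{Z}^d \setminus \Lambda$ (which is infinite), so that the right-hand side of (8.133) denotes

$$A_{\Lambda'} = \otimes_{x \in \Lambda'} B(H) = \overline{\bigcup_{\Lambda^{(1)} \in \mathcal{P}_f(\mathbb{Z}^d \setminus \Lambda)} A_{\Lambda^{(1)}}}^{\|\cdot\|}, \tag{8.134}$$

which is a C*-subalgebra of $A$. Since $\Lambda^{(2)} \subseteq \mathbb{Z}^d \setminus \Lambda^{(1)}$ whenever $\Lambda^{(1)} \cap \Lambda^{(2)} = \emptyset$, Haag duality implies Einstein locality (and sharpens it), but it is still worth mentioning these properties separately: although in quantum spin systems (8.133), and hence (8.132), holds, Einstein locality is a more fundamental property (e.g. it is also valid in algebraic quantum field theory, where Haag duality may well fail).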
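
Einstein locality (8.132) is easy to check by hand on a short chain of qubits. The sketch below embeds single-site operators into a three-site chain and verifies that operators at different sites commute, while operators at the same site need not. The helper names (`kron`, `matmul`, `commutator`, `at_site`, `is_zero`) and the choice of Pauli matrices are assumptions made for this illustration.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def at_site(op, site, n_sites):
    """The operator `op` acting at `site` (0-based), with the identity elsewhere."""
    result = [[1]]
    for s in range(n_sites):
        result = kron(result, op if s == site else I2)
    return result


def is_zero(m, tol=1e-12):
    return all(abs(x) <= tol for row in m for x in row)


if __name__ == "__main__":
    print(is_zero(commutator(at_site(SX, 0, 3), at_site(SZ, 2, 3))))  # disjoint sites
    print(is_zero(commutator(at_site(SX, 1, 3), at_site(SZ, 1, 3))))  # same site
```

Only the disjoint-site commutator vanishes, in line with (8.132).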

We now discuss some C*-algebraic concepts that will be needed for the analysis of SSB. Through the associated GNS-representation $\pi_\omega : A \to B(H_\omega)$, any state $\omega$ on $A$ defines two interesting subalgebras of $B(H_\omega)$, which a priori may be different:

• The center $A^c_\omega = \pi_\omega(A)'' \cap \pi_\omega(A)'$;

• The algebra at infinity $A^\infty_\omega = \bigcap_{\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)} \pi_\omega(A_{\Lambda'})''$.

Recall that the center of a von Neumann algebra $M \subseteq B(H)$ is $M \cap M'$, and that $M$ is called a factor if $M \cap M' = \mathbb{C} \cdot 1$ (cf. §C.21), so $A^c_\omega$ is the center of the von Neumann algebra $\pi_\omega(A)''$. It is easy to show from Einstein locality that $A^\infty_\omega \subseteq A^c_\omega$. If each local algebra $A_\Lambda$ is simple, Haag duality yields the opposite inclusion, so in that case,

$$A^\infty_\omega = A^c_\omega. \tag{8.135}$$

Given (8.129), this applies as long as $\dim(H) < \infty$, in which case also $A$ is simple.

The algebra at infinity provides a new perspective on the macroscopic observables in §8.2. Averages like $|\Lambda|^{-1} \sum_{x \in \Lambda} b(x)$, where $b \in B(H)$, do not have a limit in $A$ as $\Lambda \uparrow \mathbb{Z}^d$, but (depending on $\omega$) their representatives $|\Lambda|^{-1} \sum_{x \in \Lambda} \pi_\omega(b(x))$ may have a weak limit in $B(H_\omega)$. If they do, Einstein locality implies that the limit operator lies in the algebra at infinity $A^\infty_\omega$ (and hence, assuming (8.135), in $A^c_\omega$). If the algebra at infinity is trivial (i.e. $\mathbb{C} \cdot 1_{H_\omega}$), macroscopic observables are therefore “c-numbers”, i.e., multiples of the unit operator. In particular, they do not fluctuate, which is among the defining properties of pure thermodynamic phases. Formally, this idea is captured by the following generalization of the notion of a pure state:

Definition 8.17. A representation $\pi(A)$ is primary if $\pi(A)'' \cap \pi(A)'$ is trivial.

A state $\omega \in S(A)$ is primary if the GNS-representation $\pi_\omega$ is primary.

For compact groups $G$ (or rather their group C*-algebras $C^*(G)$), all representations are completely reducible, and a representation is primary iff it is a (possibly infinite) multiple of some irreducible representation. However, this is not the right picture for general groups or C*-algebras, which requires some discussion. In preparation, we call some representation $\pi'(A)$ on a Hilbert space $H' \subseteq H$ a subrepresentation of a representation $\pi(A)$ on $H$, written $\pi' \subseteq \pi$, if $\pi' = \pi|_{H'}$. Subrepresentations $\pi'$ of $\pi$ correspond to projections $e \in \pi(A)'$, such that $\pi'(a) = e\pi(a)$. It follows that $\pi_1(A)$ and $\pi_2(A)$ have equivalent subrepresentations iff there exists a nonzero partial isometry $w : H_1 \to H_2$ such that $w\pi_1(a) = \pi_2(a)w$ for all $a \in A$.

Definition 8.18. Two representations $\pi_1$ and $\pi_2$ of a C*-algebra $A$ are called:

1. equivalent if there is a unitary $u : H_1 \to H_2$ such that $u\pi_1(a)u^* = \pi_2(a)$ ($a \in A$);

2. quasi-equivalent if every subrepresentation of $\pi_1$ has a subrepresentation that is equivalent to some subrepresentation of $\pi_2$, and vice versa;

3. disjoint if they do not have any equivalent subrepresentations.

We say that two states $\omega_1$ and $\omega_2$ on $A$ are equivalent, disjoint, or quasi-equivalent if the corresponding GNS-representations $\pi_{\omega_1}$ and $\pi_{\omega_2}$ have the said property.

In other words, $\pi_1$ and $\pi_2$ are quasi-equivalent iff $\pi_1$ has no subrepresentations disjoint from $\pi_2$, and vice versa. This, in turn, is equivalent to the property that the set of $\pi_i$-normal states on $A$, i.e. states of the form $a \mapsto \mathrm{Tr}(\rho\pi_i(a))$ with $\rho$ a density operator on $H_i$, is the same for $i = 1$ as it is for $i = 2$. Contrapositively, $\pi_1$ and $\pi_2$ are disjoint iff no state exists that is both $\pi_1$-normal and $\pi_2$-normal. For example, taking $A = C(X)$, in which case states are probability measures $\mu$ on $X$, equivalence and disjointness of states recovers the usual notions of equivalence and disjointness of measures, respectively (i.e., having the same null sets and having disjoint supports).
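
For a finite set $X$ the commutative example above becomes completely elementary, and may be sketched as follows. Measures are represented as dictionaries from points to weights; the function names `support`, `equivalent` and `disjoint` are assumptions made for this illustration, and the restriction to finite $X$ is what makes "same null sets" coincide with "same support".

```python
def support(mu):
    """Points of strictly positive weight of a measure on a finite set."""
    return {x for x, w in mu.items() if w > 0}


def equivalent(mu, nu):
    """Same null sets; on a finite set this means equal supports."""
    return support(mu) == support(nu)


def disjoint(mu, nu):
    """Disjoint supports."""
    return support(mu).isdisjoint(support(nu))


if __name__ == "__main__":
    mu = {"a": 0.5, "b": 0.5, "c": 0.0}
    nu = {"a": 0.1, "b": 0.9}
    rho = {"c": 1.0}
    tau = {"b": 0.5, "c": 0.5}
    print(equivalent(mu, nu), disjoint(mu, rho))
    print(equivalent(mu, tau), disjoint(mu, tau))  # neither equivalent nor disjoint
```

The last pair shows that, as for representations, two measures may be neither equivalent nor disjoint.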
Proposition 8.19. For any state $\omega$, if $\omega = t\omega_1 + (1-t)\omega_2$ for some $t \in (0,1)$, then $\omega_1$ and $\omega_2$ are disjoint iff there is a projection $e$ in the center $\pi_\omega(A)' \cap \pi_\omega(A)''$ such that

$$\pi_\omega(A)|_{eH_\omega} \cong \pi_{\omega_1}(A); \tag{8.136}$$

$$\pi_\omega(A)|_{e^\perp H_\omega} \cong \pi_{\omega_2}(A). \tag{8.137}$$

Since subrepresentations of $\pi_\omega(A)$ always correspond to projections $e \in \pi_\omega(A)'$, the key assumption being made here is that $e$ also lies in the weak closure $\pi_\omega(A)''$.

Proof. One direction is easy: if (8.136) - (8.137) hold, then (arguing by contradiction) equivalent subrepresentations $\pi_1(A)$ of $\pi_{\omega_1}(A)$ and $\pi_2(A)$ of $\pi_{\omega_2}(A)$ are given by projections $e_1 \leq e$ and $e_2 \leq e^\perp = 1_{H_\omega} - e$, respectively, through

$$\pi_i(a) = \pi_\omega(a)|_{e_i H_\omega} \quad (i = 1,2,\ a \in A), \tag{8.138}$$

and the partial isometry $w$ on $H_\omega$ whose restriction to $e_1 H_\omega$ implements a (unitary) equivalence between $\pi_1(A)$ and $\pi_2(A)$ by definition satisfies $w^*w = e_1$, $ww^* = e_2$. Moreover, $e_1 \leq e$ implies $we = w$ and $e_2 \leq e^\perp$ implies $e^\perp w = w$, which together give $e^\perp we = w$. Furthermore, again by definition, $w \in \pi_\omega(A)'$. If now $e \in \pi_\omega(A)''$, then $we = ew$. Combining these equalities gives $w = e^\perp ew = 0$, which is the desired contradiction.

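Lemma 8.20. For any functional $\omega' \in A^*$ such that $0 \leq \omega' \leq \omega$, where $\omega \in S(A)$, there is an operator $c \in \pi_\omega(A)'$ on $H_\omega$ such that $0 \leq c \leq 1_{H_\omega}$ and

$$\omega'(a) = \langle \Omega_\omega, c\,\pi_\omega(a)\Omega_\omega \rangle \quad (a \in A). \tag{8.139}$$

In particular, there is a vector $\xi \in H_\omega$ such that

$$\omega'(a) = \langle \xi, \pi_\omega(a)\xi \rangle_{H_\omega}. \tag{8.140}$$

Proof. Cauchy-Schwarz for the positive semidefinite form $(a,b)' = \omega'(a^*b)$ gives

$$|\omega'(a^*b)|^2 \leq \omega'(a^*a)\,\omega'(b^*b) \leq \omega(a^*a)\,\omega(b^*b) = \|\pi_\omega(a)\Omega_\omega\|^2 \|\pi_\omega(b)\Omega_\omega\|^2.$$

Hence we obtain a well-defined positive quadratic form $B$ on $H_\omega$, initially defined on the dense domain $\pi_\omega(A)\Omega_\omega \times \pi_\omega(A)\Omega_\omega$ by the formula

$$B(\pi_\omega(a)\Omega_\omega, \pi_\omega(b)\Omega_\omega) = \omega'(a^*b), \tag{8.141}$$

and extended to $H_\omega \times H_\omega$ by continuity; the above inequality immediately gives $|B(\varphi,\psi)| \leq \|\varphi\|\,\|\psi\|$, and hence Proposition B.79 yields an operator $0 \leq c \leq 1_{H_\omega}$ such that $B(\varphi,\psi) = \langle \varphi, c\psi \rangle$. With (8.141), this gives (8.139). We now compute

$$\omega'(a^*b^*d) = B(\pi_\omega(ba)\Omega_\omega, \pi_\omega(d)\Omega_\omega) = \langle \pi_\omega(a)\Omega_\omega, \pi_\omega(b^*)\,c\,\pi_\omega(d)\Omega_\omega \rangle$$

$$= B(\pi_\omega(a)\Omega_\omega, \pi_\omega(b^*d)\Omega_\omega) = \langle \pi_\omega(a)\Omega_\omega, c\,\pi_\omega(b^*)\pi_\omega(d)\Omega_\omega \rangle,$$

so that $[c, \pi_\omega(b^*)] = 0$ for each $b \in A$, i.e., $c \in \pi_\omega(A)'$. Writing $c = c_1^2$ with $c_1^* = c_1$, and then $\xi = c_1\Omega_\omega$, completes the proof. □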
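We continue the proof of Proposition 8.19 in the converse direction. Assume

$$\omega = t\omega_1 + (1-t)\omega_2 = \omega_1' + \omega_2', \tag{8.142}$$

with $\omega_1' = t\omega_1$ and $\omega_2' = (1-t)\omega_2$, so that $0 \leq \omega_1' \leq \omega$ and $0 \leq \omega_2' \leq \omega$. It follows from the first claim in Lemma 8.20 that there is $c \in B(H_\omega)$ as stated such that

$$\omega_1'(a) = \langle \Omega_\omega, c\,\pi_\omega(a)\Omega_\omega \rangle; \tag{8.143}$$

$$\omega_2'(a) = \langle \Omega_\omega, (1_{H_\omega} - c)\pi_\omega(a)\Omega_\omega \rangle, \tag{8.144}$$

where (8.144) follows from (8.143), (C.196), and $\omega = \omega_1' + \omega_2'$. Define $\omega' \in A^*$ by

$$\omega'(a) = \langle \Omega_\omega, c(1_{H_\omega} - c)\pi_\omega(a)\Omega_\omega \rangle. \tag{8.145}$$

We have $0 \leq \omega' \leq \omega_1'$ (since $c(1_{H_\omega} - c) \leq c$) as well as $0 \leq \omega' \leq \omega_2'$ (since also $c(1_{H_\omega} - c) \leq 1_{H_\omega} - c$). Now assume that $\omega_1$ and $\omega_2$ are disjoint. Applying (8.140) with $\omega \leadsto \omega_i$ ($i = 1,2$) shows that $\omega'$ is $\pi_{\omega_1}$-normal as well as $\pi_{\omega_2}$-normal, so that it follows from the remarks following Definition 8.18 that $\omega' = 0$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(A)$ by the GNS-construction, this implies $c(1_{H_\omega} - c) = 0$, and hence $c^2 = c$. Since $c \geq 0$, which implies $c^* = c$, it follows that $c$ is a projection, henceforth called $e$. Therefore,

$$\omega_1(a) = \frac{\langle e\Omega_\omega, \pi_\omega(a)e\Omega_\omega \rangle}{\|e\Omega_\omega\|^2}; \tag{8.146}$$

$$\omega_2(a) = \frac{\langle e^\perp\Omega_\omega, \pi_\omega(a)e^\perp\Omega_\omega \rangle}{\|e^\perp\Omega_\omega\|^2}, \tag{8.147}$$

where $t = \|e\Omega_\omega\|^2$. We see from these formulae and Proposition C.91 that $\pi_{\omega_1}$ and $\pi_{\omega_2}$ are equivalent to the restrictions of $\pi_\omega$ to $eH_\omega$ and $e^\perp H_\omega$, respectively; under this equivalence, the cyclic vectors $\Omega_{\omega_1}$ and $\Omega_{\omega_2}$ correspond with $e\Omega_\omega/\|e\Omega_\omega\|$ and $e^\perp\Omega_\omega/\|e^\perp\Omega_\omega\|$, respectively. Since $e \in \pi_\omega(A)'$ by Lemma 8.20, it only remains to be shown that $e \in \pi_\omega(A)''$. To this effect, for any $b \in \pi_\omega(A)'$ and $\psi \in H_\omega$, define $\omega'' \in A^*$ by

$$\omega''(a) = \langle e^\perp be\psi, \pi_\omega(a)e^\perp be\psi \rangle. \tag{8.148}$$

Then $\omega''$ is positive, as well as $\pi_{\omega_2}$-normal, the latter because of the presence of the projection $e^\perp$ and (8.147). But for $a \in A^+$ we have the inequalities

$$0 \leq \omega''(a) \leq \|e^\perp b\|^2 \langle e\psi, \pi_\omega(a)e\psi \rangle, \tag{8.149}$$

so that $0 \leq \omega'' \leq C\omega'''$, with $C = \|e^\perp b\|^2$, for the state (assuming $e\psi$ is a unit vector)

$$\omega'''(a) = \langle \psi, e\pi_\omega(a)e\psi \rangle. \tag{8.150}$$

Since $e\psi \in eH_\omega$, the latter state is $\pi_{\omega_1}$-normal, so that $\omega''$ is itself $\pi_{\omega_1}$-normal by Lemma 8.20 (which argument by now should sound familiar). Again invoking disjointness of $\omega_1$ and $\omega_2$, it follows that $\omega'' = 0$, which, since $\psi$ was arbitrary, in turn yields $e^\perp be = 0$ for any $b \in \pi_\omega(A)'$. This forces $e \in \pi_\omega(A)''$. □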
The first of the following corollaries to Proposition 8.19 is Hepp’s Lemma:

Lemma 8.21. Let $\pi : A \to B(H)$ be a representation of $A$, and let $\psi_1, \psi_2$ be unit vectors in $H$. Then the vector states $\omega_i(a) = \langle \psi_i, \pi(a)\psi_i \rangle$ ($i = 1,2$) are disjoint only if

$$\langle \psi_1, \pi(a)\psi_2 \rangle = 0 \quad (a \in A). \tag{8.151}$$

Proof. Take, for example, $\omega = \frac{1}{2}(\omega_1 + \omega_2)$ in Proposition 8.19. □

Corollary 8.22. 1. Two primary states are either disjoint or quasi-equivalent. 

2. A state is primary iff it has no convex decomposition into disjoint states. 

Recall that a state is pure if it has no nontrivial convex decomposition whatsoever. 
The analogy between pure states and primary states may be completed as follows: 

• $\omega$ pure $\Leftrightarrow$ $\pi_\omega(A)' = \mathbb{C} \cdot 1$ (cf. Theorem C.90);

• $\omega$ primary $\Leftrightarrow$ $\pi_\omega(A)' \cap \pi_\omega(A)'' = \mathbb{C} \cdot 1$ (cf. Definition 8.17).

A physical property of primary states is that the corresponding correlation functions 
have a clustering property of a kind that may even be experimentally accessible: 

Theorem 8.23. A state $\omega$ on a quasi-local C*-algebra $A$ (8.130) has trivial algebra at infinity, i.e., $A^\infty_\omega = \mathbb{C} \cdot 1$, iff it is clustering, in the following sense: for each $a \in A$ and $\varepsilon > 0$ there is a finite $\Lambda \subset \mathbb{Z}^d$ such that for all $b \in A_{\Lambda'}$ with $\|b\| = 1$ one has

$$|\omega(ab) - \omega(a)\omega(b)| < \varepsilon. \tag{8.152}$$

In particular, if $\omega$ is primary, then it is clustering and hence (8.152) holds.

Proof. The complete proof is quite technical, but the main idea is as follows. Choose
finite regions $\Lambda_n$ moving to infinity (i.e., eventually avoiding any given $\Lambda$), and pick
elements $c_n \in A_{\Lambda_n}$, $\|c_n\| = 1$. The sequence $(\pi_\omega(c_n))$ in $B(H_\omega)$ has a weakly
convergent subsequence with limit $c \in B(H_\omega)$. This follows from the Banach-Alaoglu
Theorem B.48, applied to $B(H_\omega)$ seen as the dual space of $B_1(H_\omega)$: on the unit
ball, the corresponding weak*-topology on $B(H_\omega)$ coincides with the weak operator
topology, so that the unit ball in $B(H_\omega)$ is weakly compact and the theorem applies.

• By von Neumann's Bicommutant Theorem C.127 we have $c \in \pi_\omega(A)''$.

• By Einstein locality (8.132) and the delocalization of the $\Lambda_n$, also $c \in \pi_\omega(A)'$.

Hence $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$, and by a more refined argument (which is unnecessary if
this center coincides with $A_\omega^\infty$), even $c \in A_\omega^\infty$. So if $A_\omega^\infty = \mathbb{C}\cdot 1$ we have
$c = \langle \Omega_\omega, c\,\Omega_\omega\rangle \cdot 1$. On the other hand,

    $\langle \Omega_\omega, c\,\Omega_\omega\rangle = \lim_n \langle \Omega_\omega, \pi_\omega(c_n)\Omega_\omega\rangle = \lim_n \omega(c_n)$,

so that we may compute:

    $\lim_n \omega(a c_n) = \lim_n \langle \Omega_\omega, \pi_\omega(a)\pi_\omega(c_n)\Omega_\omega\rangle = \langle \Omega_\omega, \pi_\omega(a) c\,\Omega_\omega\rangle = \omega(a) \lim_n \omega(c_n)$.

Thus for any $\varepsilon > 0$ there is an N such that $|\omega(a c_n) - \omega(a)\omega(c_n)| < \varepsilon$ for all n > N.
To derive (8.152) from this, an easy reductio ad absurdum argument suffices.

The converse direction follows from Kaplansky's Density Theorem C.131. □
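
The factorization in (8.152) can be illustrated numerically in a toy setting. The following is only a minimal sketch under stated assumptions: a chain of three qubits (so H = ℂ² and no genuine limit to infinity is taken), a product state standing in for a primary state, and an equal mixture of two different product states standing in for a non-primary one. The helper names (`site_op`, `cluster_defect`, and so on) are hypothetical and do not come from the text.

```python
from functools import reduce

import numpy as np


def kron_all(mats):
    """Tensor product of a list of matrices, left to right."""
    return reduce(np.kron, mats)


def site_op(op, site, n_sites):
    """Embed a one-site operator at position `site` of an n-site chain."""
    identity = np.eye(2)
    return kron_all([op if k == site else identity for k in range(n_sites)])


def expect(rho, a):
    """Expectation value omega(a) = Tr(rho a) for a density matrix rho."""
    return np.trace(rho @ a).real


sz = np.diag([1.0, -1.0])
up = np.diag([1.0, 0.0])
down = np.diag([0.0, 1.0])
n = 3

# Assumed stand-ins: a product state and a mixture of two product states.
product_state = kron_all([up] * n)
mixture = 0.5 * kron_all([up] * n) + 0.5 * kron_all([down] * n)

a = site_op(sz, 0, n)   # observable localized at site 0
b = site_op(sz, 2, n)   # observable localized at the distant site 2


def cluster_defect(rho):
    """|omega(ab) - omega(a) omega(b)| for the two localized observables."""
    return abs(expect(rho, a @ b) - expect(rho, a) * expect(rho, b))


defect_product = cluster_defect(product_state)
defect_mixture = cluster_defect(mixture)
```

In this sketch the defect vanishes for the product state, whereas for the mixture it stays of order one however far apart the two sites are, which is the finite-size shadow of the failure of clustering for a non-primary state.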
8.6 Quantum spin systems: Bundles of C*-algebras

In this section we reformulate the theory of quantum spin systems in the continuous
C*-bundle language of §8.2. First, for each $N \in \mathbb{N}$ we define $\Lambda_N \in \mathcal{P}_f(\mathbb{Z}^d)$ by

    $\Lambda_N = \{x \in \mathbb{Z}^d \mid \|x\| < N\}$.    (8.153)

We then have the following analogue of the continuous bundle of C*-algebras
of Theorem 8.8. The base space remains $I = 1/\dot{\mathbb{N}} \subset [0,1]$, where
$\dot{\mathbb{N}} = \{1, 2, \ldots, \infty\}$ (seen as possible values of $1/\hbar$), and the fibers are given by

    $A_0 = A = \lim_{N\to\infty} A_{\Lambda_N} = \overline{\bigcup_{N\in\mathbb{N}} A_{\Lambda_N}}^{\|\cdot\|}$;    (8.154)

    $A_{1/N} = A_{\Lambda_N} = B(H_{\Lambda_N})$ $(N \in \mathbb{N})$,    (8.155)

cf. (8.128) - (8.130), still assuming $\dim(H) < \infty$. As before, the topology of this
bundle is defined through its continuous cross-sections $(a_{1/N})_{N\in\dot{\mathbb{N}}}$, which are the
analogues of the quasi-local sequences of Definition 8.7. Given (8.154) - (8.155),
each fiber algebra $A_{1/N}$ is a subalgebra of $A_0$, and some sequence $(a_{1/N})_{N\in\mathbb{N}}$ simply
defines a continuous cross-section of the bundle iff within A (i.e. in norm) we have

    $\lim_{N\to\infty} a_{1/N} = a_0$.    (8.156)

In other words, a sequence $(a_{1/N})_{N\in\mathbb{N}}$ with $a_{1/N} \in A_{1/N} \subset A$ is quasi-local in the
sense of Definition 8.7 iff it converges in A (i.e., iff it is Cauchy in the norm of A).

The continuous bundle of Theorem 8.4 makes equally good sense for quantum
spin systems. First, with $B = B(H) \cong M_n(\mathbb{C})$, the fibers are obviously given by

    $A_0^{(c)} = C(S(B(H)))$;    (8.157)

    $A_{1/N}^{(c)} = B(H_{\Lambda_N})$.    (8.158)

Second, the continuous sections are once again specified via symmetrization maps

    $S_{M,N} : B(H_{\Lambda_M}) \to B(H_{\Lambda_N})$,    (8.159)

defined similarly to (8.39), namely via canonical symmetrizers

    $S_N : B(H_{\Lambda_N}) \to B(H_{\Lambda_N})$    (8.160)

that are defined à la (8.35) - (8.36), where this time the tensor product and ensuing
permutation in (8.35) are over all sites $x \in \Lambda_N$. Regarding $a_{1/M} \in B(H_{\Lambda_M})$ as an
element of $B(H_{\Lambda_N})$ via the embedding $A_{\Lambda_M} \hookrightarrow A_{\Lambda_N}$, we finally define $S_{M,N}$ by

    $S_{M,N}(a_{1/M}) = S_N(a_{1/M})$.    (8.161)
Symmetric and quasi-symmetric sequences may then be defined exactly as in
Definitions 8.2 and 8.3; each quasi-symmetric sequence $(a_{1/N})_{N\in\mathbb{N}}$ duly has a limit
$a_0 \in A_0^{(c)}$ given by (8.46), where $\rho^N$ is defined as in (8.47), once again with a tensor
product over all sites $x \in \Lambda_N$. By definition, the continuous sections of the bundle
(8.157) - (8.158) are then given by the quasi-symmetric sequences.

Although the fibers A in (8.154) and C(S(B(H))) in (8.157) are as wide apart as
they could possibly be, they stunningly arise as limit algebras at $\hbar = 0$ (i.e., $N = \infty$
or $\Lambda = \mathbb{Z}^d$) for the same fiber algebras (8.155) and (8.158) at $\hbar > 0$ (i.e., $N < \infty$ or
$\Lambda \in \mathcal{P}_f(\mathbb{Z}^d)$). As in §8.2, the difference lies in the choice of the topology on the
bundle, defined via the continuous sections, which in the first case are the quasi-local
sequences, and in the second are the quasi-symmetric (i.e., macroscopic) ones.

An interesting connection between these bundles can be obtained via the following
concept, which in a way justifies the introduction of the bundles themselves.

Definition 8.24. A continuous field of states on a continuous bundle of C*-algebras
with fibers $(A_{1/N})_{N\in\dot{\mathbb{N}}}$ is a family $(\omega_{1/N})_{N\in\dot{\mathbb{N}}}$, where

    $\omega_{1/N} \in S(A_{1/N})$;    (8.162)

    $\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_0(a_0)$,    (8.163)

for each continuous cross-section $(a_{1/N})_{N\in\dot{\mathbb{N}}}$. In that case, we write

    $\omega_0 = \lim_{N\to\infty} \omega_{1/N}$,    (8.164)

despite the fact that all states in question may be defined on different C*-algebras.
For example, any state $\omega$ on $A_0 = A$ as in (8.154) defines a continuous field:

Proposition 8.25. For any state $\omega \in S(A)$, the set of states defined by

    $\omega_0 = \omega$;    (8.165)

    $\omega_{1/N} = \omega|_{A_{\Lambda_N}}$    (8.166)

is a continuous field of states on the bundle with fibers (8.154) - (8.155).

Proof. We use the notation of Definition 8.7. For local sequences (8.57) we have

    $\omega_{1/N}(a_{1/N}) = \omega(a_{1/N}) = \omega(a_{1/M})$

for all N > M. Since $a_0 = a_{1/M}$, this equals $\omega_0(a_0)$. For quasi-local sequences, $a_0$ is
the limit of the sequence $(a_{1/N})$ in the norm of A, so that $\omega(a_{1/N}) \to \omega(a_0)$. □

Definition 8.26. A state $\omega \in S(A)$ is macroscopic if $\lim_{N\to\infty} \omega(a_{1/N})$ exists for any
(quasi-)symmetric sequence $(a_{1/N})$.

It does not matter whether we put “symmetric” or “quasi-symmetric” here, since
existence of the limit for symmetric sequences implies its existence for
quasi-symmetric sequences. Indeed, using the fact that $\|\omega\| = 1$, we may estimate

    $|\omega(a_{1/N}) - \omega(a_{1/M})| \le |\omega(\tilde{a}_{1/N}) - \omega(\tilde{a}_{1/M})| + \|a_{1/N} - \tilde{a}_{1/N}\| + \|a_{1/M} - \tilde{a}_{1/M}\|$    (8.167)

for any sequence $(\tilde{a}_{1/N})$. Using Definition 8.3, and hence taking $(\tilde{a}_{1/N})$ symmetric,
we see that if $(\omega(\tilde{a}_{1/N}))$ is a Cauchy sequence, then so is $(\omega(a_{1/N}))$.

Proposition 8.27. A macroscopic state $\omega$ determines a state $\omega_0^{(c)}$ on C(S(B)) by

    $\omega_0^{(c)}(a_0) = \lim_{N\to\infty} \omega(a_{1/N})$,    (8.168)

where $(a_{1/N})$ is any quasi-symmetric sequence with limit $a_0 \in C(S(B))$, cf. (8.46).

Proof. First, note that $\omega_0^{(c)}(a_0)$ is independent of the choice of the approximating
sequence $(a_{1/N})$, since by the same argument as in the proof of Proposition C.126, if
$a_{1/N} \to a_0$ as well as $a'_{1/N} \to a_0$, we have

    $\lim_{N\to\infty} \|a_{1/N} - a'_{1/N}\| = \|a_0 - a_0\| = 0$,    (8.169)

and because $\|\omega\| = 1$ for any state $\omega$, we also have

    $|\omega(a_{1/N} - a'_{1/N})| \le \|a_{1/N} - a'_{1/N}\|$.    (8.170)

Eqs. (8.169) - (8.170) obviously imply

    $\lim_{N\to\infty} \omega(a_{1/N}) = \lim_{N\to\infty} \omega(a'_{1/N})$.    (8.171)

We next show that if $a_{1/N} \to a_0$ and $b_{1/N} \to b_0$ in the sense of (C.560), then

    $a_{1/N} b_{1/N} \to a_0 b_0$.

If $(a_{1/N})$ is a symmetric sequence à la (8.43), and likewise $(b_{1/N})$, where we may
assume without loss of generality that M is the same for both, then

    $a_0(\rho) = \rho^M(a_{1/M})$,    (8.172)

where $\rho \in S(B)$, and likewise for $b_0$. Using (8.38), we obtain

    $\lim_{N\to\infty} \rho^N(a_{1/N} b_{1/N}) = \rho^M(a_{1/M})\rho^M(b_{1/M}) = a_0(\rho) b_0(\rho) = (a_0 b_0)(\rho)$.    (8.173)

In particular, if $a_{1/N} \to a_0$, then $a_{1/N}^* a_{1/N} \to a_0^* a_0$. Since $\omega$ is a state, it follows
that $\omega_0^{(c)}(a_0^* a_0) \ge 0$, and since also $\omega_0^{(c)}(1_{S(B)}) = 1$ (because the sequence with
$a_{1/N} = 1_{B(H_{\Lambda_N})}$ converges to $1_{S(B(H))}$), the claim follows for symmetric sequences.
For quasi-symmetric sequences $(a_{1/N})$ the result follows by approximating $(a_{1/N})$
with symmetric sequences (cf. Definition 8.3). □
Each state $\omega_0^{(c)} \in S(A_0^{(c)})$ is represented by a probability measure $\mu$ on the state
space S(B(H)) of B(H). We compute this measure if $\omega \in S(A)$ is
permutation-invariant in that each restriction $\omega_{1/N} = \omega|_{A_{\Lambda_N}}$ is invariant under the natural
action of the permutation group $\mathfrak{S}_{|\Lambda_N|}$ on $A_{\Lambda_N} \cong \bigotimes_{x\in\Lambda_N} B(H)$, where $N \in \mathbb{N}$
and $|\Lambda_N|$ is the number of points in $\Lambda_N$ (as in the case of $B^\infty$ in §8.2). It
follows from the Quantum De Finetti Theorem 8.9 (and the fact that the set
$S^{\mathfrak{S}_\infty}(A)$ of permutation-invariant states on A is a so-called Bauer simplex) that each
permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ takes the form

    $\omega = \int_{S(B(H))} d\mu(\rho)\, \rho^\infty$,    (8.174)

where $\mu$ is some probability measure on S(B(H)), and $\rho \in S(B(H))$; the associated
state $\rho^\infty$ on A is defined by its values on each $A_{\Lambda_N} \subset A$ via the isomorphism

    $A_{\Lambda_N} \cong \bigotimes_{x\in\Lambda_N} B(H)$.    (8.175)

Furthermore, the integral in (8.174) is defined weakly, i.e., for any $a \in A$ the number
$\omega(a)$ is obtained by integrating the function $\rho \mapsto \rho^\infty(a)$ on S(B(H)) with respect to
$\mu$. In particular, $\omega \in \partial_e S^{\mathfrak{S}_\infty}(A)$ iff $\mu$ is a Dirac measure on S(B(H)).

Proposition 8.28. Each permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ is macroscopic
(cf. Definition 8.26), and the probability measure $\mu$ on S(B(H)) defined by $\omega_0^{(c)}$
via (8.168) coincides with the one appearing in (8.174).

Proof. Let $(a_{1/N})$ be a symmetric sequence (the quasi-symmetric case follows from
this), so that $a_{1/N} = S_{M,N}(a_{1/M})$ for some M whenever N > M, cf. (8.43). The limit
$a_0 \in C(S(B(H)))$ is given by (8.172), so that the state on C(S(B(H))) defined by

    $f \mapsto \int_{S(B(H))} d\mu(\rho)\, f(\rho)$    (8.176)

satisfies the required condition

    $\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_{1/M}(a_{1/M}) = \int_{S(B(H))} d\mu(\rho)\, \rho^M(a_{1/M}) = \omega_0^{(c)}(a_0)$. □
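
The key step in this proof, namely that a permutation-invariant state does not see the symmetrization, can be checked numerically on a small example. The sketch below rests on assumptions that are not in the text: qubits (d = 2), M = 2 and N = 3, a two-point measure with illustrative weights 0.3 and 0.7, and a hypothetical helper `symmetrize` that implements $S_{M,N}$ by embedding $a \otimes 1$ and averaging over all permutations of the N tensor factors.

```python
import itertools
import math
from functools import reduce

import numpy as np


def tensor_power(rho, n):
    """rho tensored with itself n times."""
    return reduce(np.kron, [rho] * n)


def symmetrize(a, m, n, d=2):
    """Sketch of S_{M,N}: embed a as (a tensor identity) on n sites, then
    average over all permutations of the n tensor factors."""
    embedded = np.kron(a, np.eye(d ** (n - m)))
    t = embedded.reshape([d] * (2 * n))        # one row and one column index per site
    total = np.zeros_like(t)
    for p in itertools.permutations(range(n)):
        axes = list(p) + [n + k for k in p]    # permute row and column indices alike
        total = total + np.transpose(t, axes)
    return (total / math.factorial(n)).reshape(d ** n, d ** n)


sz = np.diag([1.0, -1.0])
rho1 = np.diag([1.0, 0.0])                     # pure state, spin up along z
rho2 = np.full((2, 2), 0.5)                    # pure state, spin up along x
weights = [0.3, 0.7]                           # illustrative two-point measure mu


def omega(n):
    """Restriction to n sites of the mixture sum_i w_i rho_i^{tensor n}."""
    return sum(w * tensor_power(r, n) for w, r in zip(weights, [rho1, rho2]))


a = np.kron(sz, sz)                            # a_{1/M} with M = 2
lhs = np.trace(omega(3) @ symmetrize(a, 2, 3)) # omega_{1/N}(S_{M,N}(a_{1/M})), N = 3
rhs = sum(w * np.trace(tensor_power(r, 2) @ a)
          for w, r in zip(weights, [rho1, rho2]))  # integral of rho^M(a_{1/M}) d mu
```

Here `lhs` and `rhs` agree up to rounding, mirroring the chain of equalities at the end of the proof: the left-hand side is already independent of N once N > M.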

To proceed we make the following technical assumption on $\omega \in S(A)$ (which is
satisfied in typical physical models): if $\pi_\omega(a_{1/N}) \to 0$ weakly in $B(H_\omega)$, for some
sequence $(a_{1/N})$ where $a_{1/N} \in A_{1/N}$, then $\pi_\omega(a_{1/N}) \to 0$ in $B(H_\omega)$ (in norm).

Theorem 8.29. Assume that the state $\omega$ in part 1 below (and likewise the states $\omega_1$
and $\omega_2$ in part 2) satisfies the above technical condition. Then:

1. If $\omega$ is a primary macroscopic state on A, then the corresponding state $\omega_0^{(c)}$ is
pure, i.e., the probability measure $\mu$ on S(B(H)) is a Dirac measure.

2. If $\omega_1$ and $\omega_2$ are quasi-equivalent primary macroscopic states on A, then $\mu_1 = \mu_2$
(and hence if $\mu_1 \ne \mu_2$, then $\omega_1$ and $\omega_2$ are disjoint).
The techniques in the proof below can be used to show that our additional
assumption is equivalent to: if (8.178) below holds weakly in $B(H_\omega)$, then it also holds
strongly. Thus we could have redefined a macroscopic state $\omega$ as one for which the
strong limit $\lim_{N\to\infty} \pi_\omega(a_{1/N})$ exists in $B(H_\omega)$ (and some authors indeed do so).

Proof. We first show that if $\omega$ is a primary macroscopic state on A, and $(a_{1/N})$ is
symmetric (from which the quasi-symmetric case duly follows) such that

    $\lim_{N\to\infty} \omega(a_{1/N}) = \alpha$,    (8.177)

then, in the weak operator topology on $B(H_\omega)$ (with $H_\omega$ the GNS-representation space),

    $\lim_{N\to\infty} \pi_\omega(a_{1/N}) = \alpha \cdot 1_{H_\omega}$.    (8.178)

To this end, we first note that $\|a_{1/N}\|$ is uniformly bounded in N: if $(a_{1/N})$ is
symmetric, as in (8.43), then obviously $\|a_{1/N}\| \le \|a_{1/M}\|$ for all N > M, so that if
$(a_{1/N})$ is merely quasi-symmetric we have $\|a_{1/N}\| < \|\tilde{a}_{1/M}\| + \varepsilon$ for all N > M, where $\varepsilon$
and M are the quantities appearing in Definition 8.3. Hence it is enough to establish
the weak limit (8.178) between states in a dense set, viz. $\pi_\omega(b)\Omega_\omega$, where $b \in A$,
or even $b \in \bigcup_{K\in\mathbb{N}} A_{\Lambda_K}$. Furthermore, using the polarization identity (A.5) and (C.8) -
(C.9), it is enough to prove that for each $K \in \mathbb{N}$ and $b \in A_{\Lambda_K}$ we have

    $\lim_{N\to\infty} \omega(b^* a_{1/N} b) = \alpha\, \omega(b^* b)$,    (8.179)

since by the GNS-construction we obviously have

    $\langle \pi_\omega(b)\Omega_\omega, \pi_\omega(a_{1/N})\pi_\omega(b)\Omega_\omega\rangle = \omega(b^* a_{1/N} b)$.    (8.180)

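Theorem 8.23 implies (or even states) that if $\omega$ is primary, for each $b \in A$ and $\varepsilon > 0$
there is $M \in \mathbb{N}$ such that for all $a \in A_{\Lambda_M'}$ with $\|a\| = 1$, we have

    $|\omega(b^* b a) - \omega(b^* b)\omega(a)| < \varepsilon$.    (8.181)

Assuming $b \in A_{\Lambda_K}$, we first note that $\lim_{N\to\infty} [a_{1/N}, b] = 0$ in norm (even though
$\lim_{N\to\infty} a_{1/N}$ does not exist in norm), and secondly that, for any given $M \in \mathbb{N}$, if $a'_{1/N}$
is the same as $a_{1/N}$ except that in any term $b_1 \otimes \cdots \otimes b_{|\Lambda_N|}$ that contributes to
$a_{1/N}$ we replace $b_i$ by $1_H$ whenever $b_i \in A_{\Lambda_M}$, then

    $\lim_{N\to\infty} \|a_{1/N} - a'_{1/N}\| = 0$.    (8.182)

Given (8.177), these facts with (8.181) immediately give (8.179) and hence (8.178).
According to (8.177) and (8.178), the state $\omega_0^{(c)} \in S(C(S(B(H))))$ is given by

    $\omega_0^{(c)}(a_0) = \lim_{N\to\infty} \langle \Omega_\omega, \pi_\omega(a_{1/N})\Omega_\omega\rangle$,    (8.183)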
where $(a_{1/N})$ is some symmetric sequence converging to $a_0$ in the sense of (C.560);
as in the proof of Proposition 8.27, the left-hand side is independent of the particular
choice of this sequence. The proof of Proposition 8.27 also showed that if $a_{1/N} \to a_0$
and $b_{1/N} \to b_0$, then $a_{1/N} b_{1/N} \to a_0 b_0$, so that

    $\omega_0^{(c)}(a_0 b_0) = \lim_{N\to\infty} \langle \Omega_\omega, \pi_\omega(a_{1/N} b_{1/N})\Omega_\omega\rangle$

    $= \lim_{N\to\infty} \langle \Omega_\omega, (\pi_\omega(a_{1/N}) - \alpha \cdot 1_{H_\omega})\pi_\omega(b_{1/N})\Omega_\omega\rangle + \alpha\beta$,

where $\alpha$ is defined by (8.177), and likewise $\beta$. At this point we need our
additional assumption, which, together with uniform boundedness of $\|\pi_\omega(a_{1/N})\|$ and
hence of $\|\pi_\omega(a_{1/N})\Omega_\omega\|$ in N yields that the first term in the second line is zero.

Therefore, $\omega_0^{(c)}$ is multiplicative and hence pure (cf. Proposition C.14).

To prove the second claim, first suppose $\omega_1$ and $\omega_2$ are quasi-equivalent. In that
case, up to unitary equivalence, either $\pi_{\omega_1}$ is a subrepresentation of $\pi_{\omega_2}$ or vice
versa; assume the former. We then have a projection $e \in \pi_{\omega_2}(A)'$ such that

    $\pi_{\omega_1}(a) = e\,\pi_{\omega_2}(a)$,    (8.184)

for each $a \in A$, and since $e = 1_{H_{\omega_1}}$ by construction, eq. (8.178) gives

    $\lim_{N\to\infty} \pi_{\omega_1}(a_{1/N}) = \alpha_1 \cdot e$;    (8.185)

    $\lim_{N\to\infty} \pi_{\omega_2}(a_{1/N}) = \alpha_2 \cdot 1_{H_{\omega_2}}$.    (8.186)

Multiplying both sides of (8.186) with e gives $\alpha_1 = \alpha_2$. □


Corollary 8.30. A permutation-invariant state $\omega \in S^{\mathfrak{S}_\infty}(A)$ is primary iff the
corresponding measure $\mu$ in (8.174) is a Dirac measure, and it is pure iff the latter is
supported by a pure state on B(H).

Proof. In the first claim, the inference from “primary” to “Dirac” obviously follows
from Theorem 8.29. The converse direction is a consequence of the commutation
theorem (C.329) for von Neumann algebras, combined with the fact that each
representation of B(H) for finite-dimensional H is primary (which in turn follows from
the fact, not proved in this book, that B(H) has just one irreducible representation,
up to equivalence). The second claim follows from Proposition C.105. □

Finally, one macroscopic state generates many others. A folium in the state space
S(A) of a C*-algebra A is a convex, norm-closed subspace $\mathfrak{F}$ of S(A) with the
property that if $\omega \in \mathfrak{F}$ and $b \in A$ such that $\omega(b^* b) > 0$, then the “reduced” state
$\omega_b : a \mapsto \omega(b^* a b)/\omega(b^* b)$ must be in $\mathfrak{F}$. For example, if $\pi$ is a representation of
A on a Hilbert space H, then the set of all density matrices on H (i.e. the $\pi$-normal
states on A) comprises a folium $\mathfrak{F}_\pi$. In particular, each state $\omega$ on A defines a folium
$\mathfrak{F}_\omega$ through its GNS-representation $\pi_\omega$. It then follows from cyclicity of the
GNS-representation that each state in the folium of a macroscopic state $\omega \in S(A)$
is automatically macroscopic and even has the same limit state $\omega_0^{(c)}$ as $\omega$.
Notes


§8.1. Large quantum numbers 

Theorem 8.1 has been adapted from Landsman (1998b); the proof relies on Si¬ 
mon (1980), who, generalizing the case of SU (2) treated by Lieb (1973), in turn uses 
the coherent states for Lie groups introduced by Perelomov (1972, 1986). Duffield 
(1999) gives the details of the method of steepest descent used in proving (8.30). 
Although this material was inspired by Bohr’s Correspondence Principle, at the end 
of the day the relationship may seem remote. 

§8.2. Large systems 

The theory in this section, which elaborates on Landsman (2007), is a reformula¬ 
tion in terms of continuous bundles of C*-algebras of the formal parts of a series of 
papers on quantum mean-field systems by Raggio & Werner (1989, 1991), Duffield 
& Werner (1992a,b,c), and Duffield, Roos, & Werner (1992). These models have 
their origin in the treatment of the BCS theory of superconductivity due to Bogoli- 
ubov (1958) and Haag (1962); for further references see the notes to §10.8. 

§8.3. Quantum de Finetti Theorem 

Theorem 8.9 is due to Størmer (1969), whose proof was based on the fact that
the $\mathfrak{S}_\infty$-action on $B^\infty$ is asymptotically abelian, in that for any $a, a' \in B^\infty$ one has

    $\inf\{\|[\alpha_p(a), a']\|, p \in \mathfrak{S}_\infty\} = 0$.

This implies that $S^{\mathfrak{S}_\infty}(B^\infty)$ is a Choquet simplex, which quickly leads to (8.66). Our
proof is taken from Hudson & Moody (1975). See also Caves, Fuchs, & Schack
(2002a). Finite-size corrections to Theorem 8.9 are studied e.g. in König & Mitchison
(2009). Corollary 8.11 is due to Hewitt & Savage (1955), who credit Jules Haag
(rather than De Finetti) for the binary case (i.e., X = {0,1}). See Kallenberg (2005)
for an exhaustive account of such results (in classical probability theory).

Proposition 8.12 is taken from Diaconis & Freedman (1980), who also give
finite-size corrections to Corollary 8.11, as follows. Let a permutation-invariant
probability measure $\nu_N$ on $X^N$ be K-exchangeable, so that there is a permutation-
invariant probability measure $\nu_{N+K}$ on $X^{N+K}$ whose restriction to $X^N$ is $\nu_N$. Let
$P_{N+K}$ be the probability measure on Pr(X) defined by $\nu_{N+K}$ as in (8.85), i.e.,
$P_{N+K}(A) = \nu_{N+K}(E_{N+K}^{-1}(A))$, and finally define

'^'n+k= f dPN+K{B) , 

JPr{X) 


as in (8.79). Then, in terms of the usual norm on the Banach dual C{X^)*, 


llViV 


v;il< 


N 


Proposition 8.13 is stated without proof in Kingman (1978). See Mackey (1974) or 
Gray (2009) for ergodic theory in connection with probability theory. 
Of course, there are numerous results in probability theory that do not share the
problems of the law of large numbers. For example, in the situation (8.94), for any 
e > 0 one has the Chernoff-Hoeffding bound 



which is superior to the weak law of large numbers, i.e., for every e > 0, 



which from the point of view of Earman's Principle is already a marked conceptual
improvement over the strong law (but which is mathematically weaker).

§8.4. Frequency interpretation of probability and Born rule 

The Kolmogorov quote is from Fine (1973, p. 94), which even 40 years later is
still to be recommended as one of the best (technical) books on the foundations of
probability theory. See also Hájek & Hitchcock (2016) for a comprehensive recent
survey of the philosophy of probability. The Keynes quote is from Hacking (2001,
p. 149), which is a very elementary introduction to the foundations of probability.
At a more advanced level see also Gillies (2000), whilst Howson (1995) is a useful
brief survey.

The original version of the Principal Principle (Lewis, 1980) equated probability
(or chance) as subjective degree of belief (i.e. credence) with objective chance
(though in the single case as opposed to relative frequency). Our own version in the
main text is meant to clarify the relationship between single-case probabilities and
long-run frequencies, both seen as objective.

Attempts to derive the Born rule started with Finkelstein (1965) and were continued
e.g. by Hartle (1968), Farhi, Goldstone, & Gutmann (1989), Van Wesep (2006),
Aguirre & Tegmark (2011), Moulay (2014), and others, partly based on indubitable
mathematical arguments in the spirit of the strong law of large numbers supplied
by e.g. Ochs (1977, 1980), Bugajski & Motyka (1981), Pulmannová & Stehlíková
(1986). Such attempts (typically presented as claims) provoked valid critiques of the
kind mentioned in the main text from e.g. Cassinelli & Sánchez-Gómez (1996) and
Caves & Schack (2005). For a balanced account see also Cassinelli & Lahti (1989).
Infinite tensor products of Hilbert spaces were introduced by von Neumann (1938).

Our approach, which is sympathetic to both sides of the dispute, is a vast
expansion of Landsman (2008). The existence of $e^\infty$ as in (8.109) - (8.110) is based
on the same extension argument that proves the Kolmogorov existence theorem for 
infinite product probabilities, see e.g. Dudley (1989), proof of Theorem 8.2.2, and 
Van Wesep (2006), who carries out the proof for X = {0,1}. 

There is also a large (and inconclusive) literature on alleged derivations of the
Born rule in the context of the Many-Worlds (i.e. Everettian) Interpretation of
quantum mechanics, which may be traced back from Wallace (2012), who supports such
derivations, and Dawid & Thébault (2015), who criticize them.
§8.5. Quantum spin systems: Quasi-local C*-algebras

Basic references are Ruelle (1969), Israel (1979), Bratteli & Robinson (1987, 
1997), and Simon (1993); for macroscopic states see Hepp (1972) and Sewell 
(2002). Naaijkens (2013) is a useful brief introduction to quantum spin systems. 

The proof that Haag duality holds for quantum spin systems is far from trivial: see
Simon (1993), Prop. IV.1.6. In the proof of (8.135), simplicity of A given simplicity
of each $A_\Lambda$ is easily inferred from the fact that if $I \subset A$ is an ideal, then $I_\Lambda = I \cap A_\Lambda$ is
an ideal in $A_\Lambda = B(H_\Lambda)$, which must be either zero or $A_\Lambda$, both of which contradict
non-triviality of I. Theorem 8.23 is a famous result due to Lanford & Ruelle (1969),
partly anticipated by Powers (1967). For a complete proof see also Simon (1993),
Theorem IV.1.4.

§8.6. Quantum spin systems: Bundles of C*-algebras

This section was inspired by Landsman (2007), §6, and Gerisch (1993). 

Folia of states (in the sense meant here) were introduced by Haag, Kadison, & 
Kastler (1970), but note that the name “folium” is poorly chosen, since S{A) is by 
no means foliated by its folia (for example, a folium may contain subfolia). 





Chapter 9 

Symmetry in algebraic quantum theory 


In §3.9 we defined symmetries of classical physics as symmetries of either Poisson 
manifolds or Poisson algebras; these notions are equivalent. At the bare level of the 
underlying phase space X, merely seen as a locally compact space (rather than a 
Poisson manifold), the key result establishing this equivalence is this: 

Theorem 9.1. Let X and Y be locally compact Hausdorff spaces. Each isomorphism
$\alpha : C_0(Y) \to C_0(X)$ is induced by a homeomorphism $\varphi : X \to Y$ via $\alpha = \varphi^*$ (and so
each automorphism of $C_0(X)$ is induced by a homeomorphism of X).

More generally, if A and B are commutative C*-algebras, then each isomorphism
$\alpha : A \to B$ is induced by a homeomorphism $\varphi : \Sigma(B) \to \Sigma(A)$ of the corresponding
Gelfand spectra via $\alpha = G_B^{-1} \circ \varphi^* \circ G_A$, where $G_A : A \to C_0(\Sigma(A))$ is the Gelfand
isomorphism, cf. (C.79), and similarly for B (and so each automorphism of A is
induced by a homeomorphism of its Gelfand spectrum $\Sigma(A)$).

This immediately follows from Theorems C.8 and C.45, and Corollary C.48. 

In Chapter 5 we saw that even in elementary quantum mechanics, where A =
B(H) for some Hilbert space H, the concept of a symmetry is more diverse, at least
apparently, since a non-commutative C*-algebra like B(H) gives rise to numerous
“quantum structures”. The ones we looked at were listed after Proposition 5.3, viz.

1. The normal pure state space $\mathcal{P}_1(H)$, dressed with a transition probability (2.44).

2. The normal (total) state space $\mathcal{D}(H)$, seen as a convex set; see Theorem 2.8.

3. The self-adjoint operators $B(H)_{sa}$ on H, seen as a Jordan algebra.

4. The effects $\mathcal{E}(H) = [0,1]_{B(H)}$ on H, seen as a convex poset.

5. The projections $\mathcal{P}(H)$ on H, seen as an orthocomplemented lattice.

6. The unital commutative C*-subalgebras $\mathcal{C}(B(H))$ of B(H), seen as a poset.

Each structure comes with its own notion of a symmetry, see Definition 5.1. This 
raises two questions, which for B{H) were completely answered in Chapter 5: 

• The possible equivalence of the various notions of quantum symmetry; 

• Unitary implementability of symmetries. 

Indeed, it was found that if dim(H) > 2, then all these notions of symmetry are
equivalent, as well as unitarily implementable à la Wigner; see Theorem 5.4.
9.1 Symmetries of C*-algebras and Hamhalter's Theorem

In this chapter we generalize this analysis from A = B(H) to arbitrary C*-algebras
A, which for simplicity we assume to have a unit $1_A$. See §C.25 for terminology.

Definition 9.2. Let A be a unital C*-algebra.

1. The pure state space $P(A) = \partial_e S(A)$ is the extreme boundary of the state space
S(A), seen as a uniform space equipped with a transition probability

    $\tau(\omega, \omega') = \inf\{\omega(a) \mid a \in A, 0 \le a \le 1_A, \omega'(a) = 1\}$.    (9.1)

A Wigner symmetry of A is a uniformly continuous bijection $W : P(A) \to P(A)$
with uniformly continuous inverse that preserves transition probabilities, i.e.,

    $\tau(W(\omega), W(\omega')) = \tau(\omega, \omega')$, $\omega, \omega' \in P(A)$.    (9.2)

If A = B(H), Proposition C.177 guarantees that the above expression reproduces
the standard quantum-mechanical transition probabilities (2.44), but compared
to this special case, one novel aspect of P(A) is that all pure states are now taken
into account (as opposed to merely the normal ones, which notion is undefined
for general C*-algebras anyway). Another is that in order to obtain the desired
equivalence with other structures, the set P(A) should carry a uniform structure,
namely the w*-uniformity inherited from A*.

2. The state space S(A) is the set of all states on A, seen as a compact convex set in
the w*-topology inherited from the embedding $S(A) \subset A^*$. A Kadison symmetry
of A is an affine homeomorphism $K : S(A) \to S(A)$.

Compared to A = B(H), firstly all states are now taken into account (instead of
all normal states), and secondly we have added a continuity condition on K.

3. Any C*-algebra A defines an associated Jordan algebra (more precisely, a
JB-algebra), namely $A_{sa}$ equipped with the commutative product $a \circ b = \tfrac{1}{2}(ab + ba)$.
A Jordan symmetry J of A is a Jordan isomorphism of $(A_{sa}, \circ)$ (or, equivalently,
an invertible unital linear isometry of $(A_{sa}, \|\cdot\|)$, which in turn is the same as
a unital linear order isomorphism of $(A_{sa}, \le)$, cf. Lemma C.173). A weak Jordan
symmetry of A is an invertible map $J : A_{sa} \to A_{sa}$ whose restriction to each
subspace $C_{sa}$ of $A_{sa}$, where $C \in \mathcal{C}(A)$, is linear and preserves the Jordan product.

4. The effects in A comprise the order unit interval $\mathcal{E}(A) = [0, 1_A]$, i.e., the set of
all $a \in A_{sa}$ such that $0 \le a \le 1_A$, seen as a convex poset in the obvious way. A
Ludwig symmetry of A is an affine order isomorphism $L : \mathcal{E}(A) \to \mathcal{E}(A)$.

5. The projections $\mathcal{P}(A)$ in A form an orthomodular poset (cf. Definition D.1) with
$e \le f$ iff $ef = e$ and $e^\perp = 1_A - e$; if A is a von Neumann algebra (cf. Proposition
C.136), or more generally an AW*-algebra or a Rickart C*-algebra (see §C.24),
$\mathcal{P}(A)$ is even an orthomodular lattice. A von Neumann symmetry of A is an
isomorphism $N : \mathcal{P}(A) \to \mathcal{P}(A)$ of orthomodular posets.

6. The poset $\mathcal{C}(A)$ (lying at the heart of exact Bohrification) consists of all
commutative C*-subalgebras of A that contain the unit $1_A$, partially ordered by
inclusion. A Bohr symmetry of A, then, is an order isomorphism $B : \mathcal{C}(A) \to \mathcal{C}(A)$.
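
To see concretely why the Jordan product, rather than the associative product, is the relevant structure for Jordan symmetries, consider transposition on $M_2(\mathbb{C})$. This is a standard example, not one taken from the text: transposition preserves self-adjointness and the Jordan product but reverses the order of ordinary products. The sketch below checks this numerically for two illustrative self-adjoint matrices; the names `jordan` and `transpose_map` are hypothetical.

```python
import numpy as np


def jordan(a, b):
    """Jordan product a o b = (ab + ba) / 2."""
    return 0.5 * (a @ b + b @ a)


def transpose_map(a):
    """Candidate Jordan symmetry: matrix transposition."""
    return a.T


# Two self-adjoint 2x2 matrices that do not commute (illustrative choices).
a = np.array([[1, 1j], [-1j, 2]], dtype=complex)
b = np.array([[0, 1], [1, 3]], dtype=complex)

preserves_jordan = np.allclose(
    transpose_map(jordan(a, b)), jordan(transpose_map(a), transpose_map(b))
)
preserves_product = np.allclose(
    transpose_map(a @ b), transpose_map(a) @ transpose_map(b)
)
reverses_product = np.allclose(
    transpose_map(a @ b), transpose_map(b) @ transpose_map(a)
)
```

For these matrices `preserves_jordan` and `reverses_product` hold while `preserves_product` fails, so transposition is compatible with the Jordan product without being a homomorphism for the associative one.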
The structures 1, 2, 3 (with Jordan symmetries), and 4 are equivalent; see
Theorem C.179 for 1 ↔ 2 and Theorem C.172 for 2 ↔ 3; the equivalence 3 ↔ 4 is proved
in exactly the same way as in Proposition 5.21, with Lemma 5.20 for the special case
A = B(H) replaced by Lemma C.173 (which has the same proof). From 1-4 we pick
the Jordan algebra structure of A, since it gives the most straightforward results.

Henceforth, A and B are unital C*-algebras, and we define a weak Jordan
isomorphism of A and B as an invertible map $J : A_{sa} \to B_{sa}$ whose restriction to each
subspace $C_{sa}$ of $A_{sa}$, where $C \in \mathcal{C}(A)$, is linear and preserves the Jordan product
$\circ$ (so that a Jordan symmetry of A alone is a weak Jordan automorphism of A).
Such a map complexifies to a map $J_\mathbb{C} : A \to B$ in the usual way, i.e. writing $a \in A$
as a = b + ic, with b* = b and c* = c, cf. (C.9), we put $J_\mathbb{C}(a) = J(b) + iJ(c)$. If no
confusion arises, we just write J for $J_\mathbb{C}$. We first turn to Bohr symmetries.

Proposition 9.3. Given a weak Jordan isomorphism $J : A_{sa} \to B_{sa}$, the ensuing map
$B : \mathcal{C}(A) \to \mathcal{C}(B)$ defined by $B(C) = J_\mathbb{C}(C) = J(C)$ is an order isomorphism.

Note that as an argument of B the symbol C is a point in the poset $\mathcal{C}(A)$, whereas
as an argument of $J_\mathbb{C}$ it is a subset of A, so that $J_\mathbb{C}(C)$ stands for $\{J_\mathbb{C}(c) \mid c \in C\}$.

Proof. The restriction $J|_C : C \to B$ is a homomorphism of C*-algebras on each
commutative C*-algebra $C \subset A$ (although $J : A \to B$ may not be). Since $J|_C$ is injective
on $C_{sa}$ (where it coincides with J), it is also injective on C. Hence $J|_C$ is isometric
by Theorem C.62.3, so that its range is closed and therefore J(C) is a commutative
C*-algebra in B, which is unital if C is. Trivially, if $C \subseteq D$ in A (so that $C \le D$ in
$\mathcal{C}(A)$), then $J(C) \subseteq J(D)$ in B (so that $J(C) \le J(D)$ in $\mathcal{C}(B)$). □

The converse, however, is a deep result, which we call Hamhalter’s Theorem: 

Theorem 9.4. Let A and B be unital C*-algebras and let $B : \mathcal{C}(A) \to \mathcal{C}(B)$ be an
order isomorphism. Then there is a weak Jordan isomorphism $J : A_{sa} \to B_{sa}$ such that
$B = J_\mathbb{C}$. Moreover, if A is isomorphic to neither $\mathbb{C}^2$ nor $M_2(\mathbb{C})$, then J is uniquely
determined by B, so in that case there is a bijective correspondence J ↔ B between
weak Jordan symmetries J of A and Bohr symmetries B of A.

Before proving this, let us explain why $\mathbb{C}^2$ and $M_2(\mathbb{C})$ are exceptional. In the first
case, $\mathcal{C}(\mathbb{C}^2) = \{0, 1\}$ (with $0 = \mathbb{C}\cdot 1_2$ and $1 = \mathbb{C}^2$), which admits just one order
isomorphism (viz. the identity map), which is induced by both the map $(a,b) \mapsto (b,a)$
and by the identity map on $\mathbb{C}^2$ (each of which is a weak Jordan automorphism).

In the second case, the poset $\mathcal{C}(M_2(\mathbb{C}))$ has a bottom element $0 = \mathbb{C}\cdot 1_2$, as
before, but no top element; each element $C \ne \mathbb{C}\cdot 1_2$ of $\mathcal{C}(M_2(\mathbb{C}))$ is a unitary
conjugate of the diagonal subalgebra $D_2(\mathbb{C})$, with $0 \le C$ but no other orderings.
Furthermore, $C \cap C' = \mathbb{C}\cdot 1_2$ whenever $C \ne C'$. Hence any order isomorphism of $\mathcal{C}(M_2(\mathbb{C}))$
maps $\mathbb{C}\cdot 1_2$ to itself and permutes the C's. Thus each map $J : M_2(\mathbb{C})_{sa} \to M_2(\mathbb{C})_{sa}$
whose complexification $J_\mathbb{C} : M_2(\mathbb{C}) \to M_2(\mathbb{C})$ shuffles the C's isomorphically (as
C*-algebras) gives a weak Jordan automorphism. For example, take $(a,b) \mapsto (b,a)$
on $D_2(\mathbb{C})$ and the identity on each $C \ne D_2(\mathbb{C})$; this induces the identity map on
$\mathcal{C}(M_2(\mathbb{C}))$. It follows that there are vastly more weak Jordan automorphisms of
$M_2(\mathbb{C})$ than there are order isomorphisms of $\mathcal{C}(M_2(\mathbb{C}))$.
Proof. The key to the proof lies in the commutative case, which can be reduced to
topology. If A = C(X), any $C \in \mathcal{C}(A)$ induces an equivalence relation $\sim_C$ on X by

    $x \sim_C y$ iff $f(x) = f(y)$ $\forall f \in C$.    (9.3)

This, in turn, defines a partition $X = \bigsqcup_\lambda K_\lambda$ of X (henceforth called $\pi$), whose blocks
$K_\lambda \subset X$ are the equivalence classes of $\sim_C$. To study a possible inverse of this
procedure, for any closed subset $K \subset X$ we define the ideal

    $I_K = C(X; K) = \{f \in C(X) \mid f(x) = 0 \ \forall x \in K\}$,    (9.4)

in C(X), and its unitization $\dot{I}_K = I_K \oplus \mathbb{C}\cdot 1_X$, which evidently consists of all
continuous functions on X that are constant on K. If X is finite (and discrete), each partition
$\pi$ of X defines some unital C*-algebra $C \subseteq C(X)$ through

    $C = \bigcap_{K_\lambda \in \pi} \dot{I}_{K_\lambda}$,    (9.5)

which consists of all $f \in C(X)$ that are constant on each block $K_\lambda$ of the given
partition $\pi$. In that case, the correspondence $C \leftrightarrow \pi$, where $\pi$ is defined by the
equivalence relation $\sim_C$ in (9.3), gives a bijection between $\mathcal{C}(C(X))$ and the set
$\mathfrak{P}(X)$ of all partitions of X. For example, the subalgebra $C = \dot{I}_K$ corresponds to the
partition consisting of K and all singletons not lying in K. Given the already defined
partial order on $\mathcal{C}(C(X))$ (i.e., $C \le D$ iff $C \subseteq D$), we may promote this bijection to
an order isomorphism of posets if we define the partial order $\le'$ on $\mathfrak{P}(X)$ to be the
opposite of the natural one $\le$ in which $\pi \le \pi'$ (where $\pi$ and $\pi'$ consist of blocks
$\{K_\lambda\}$ and $\{K'_{\lambda'}\}$, respectively) iff each $K_\lambda$ is contained in some $K'_{\lambda'}$ (i.e., $\pi$ is finer
than $\pi'$). The partial ordering $\le'$ makes $\mathfrak{P}(X)$ a complete lattice, whose top element
consists of all singletons on X and whose bottom element just consists of X itself;
the former corresponds to C(X), which is the top element of $\mathcal{C}(C(X))$, whilst the
latter corresponds to $\mathbb{C}\cdot 1_X$, which is the bottom element of $\mathcal{C}(C(X))$.
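
For finite X the order-reversing correspondence between partitions and unital subalgebras can be verified by brute force. The following sketch assumes X = {0, 1, 2, 3}, identifies C(X) with $\mathbb{R}^4$ (real-valued functions suffice for testing inclusions), and represents the subalgebra attached to a partition by the span of the indicator functions of its blocks. All function names are hypothetical.

```python
import numpy as np


def partitions(elements):
    """Yield every partition of a list as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for p in partitions(rest):
        # Either add `first` to an existing block ...
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        # ... or let it form a block of its own.
        yield [[first]] + p


def indicator_basis(partition, n):
    """Rows are indicator functions of the blocks: a basis of the
    functions on X that are constant on each block."""
    basis = np.zeros((len(partition), n))
    for row, block in enumerate(partition):
        basis[row, block] = 1.0
    return basis


def is_subalgebra_of(p_small, p_big, n):
    """Test whether the algebra of p_small lies inside that of p_big, via ranks."""
    big = indicator_basis(p_big, n)
    stacked = np.vstack([big, indicator_basis(p_small, n)])
    return np.linalg.matrix_rank(stacked) == np.linalg.matrix_rank(big)


def is_finer(p, q):
    """p is finer than q iff every block of p lies in some block of q."""
    return all(any(set(b) <= set(c) for c in q) for b in p)


X = [0, 1, 2, 3]
all_partitions = list(partitions(X))
order_reversed = all(
    is_subalgebra_of(q, p, len(X)) == is_finer(p, q)
    for p in all_partitions
    for q in all_partitions
)
```

In this sketch the algebra of q is contained in the algebra of p exactly when p is finer than q, which is the reason for using the opposite ordering $\le'$ on partitions.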

For general compact Hausdorff spaces $X$, since $C(X)$ is sensitive to the topology
of $X$, the equivalence relation (9.3) does not induce arbitrary partitions of $X$. It turns
out that each $C \in \mathcal{C}(C(X))$ induces an upper semicontinuous partition (abbreviated
u.s.c. decomposition) of $X$, i.e.,

• Each block $K_\lambda$ of the partition $\pi$ is closed;

• For each block $K_\lambda$ of $\pi$, if $K_\lambda \subseteq U$ for some open $U \in \mathcal{O}(X)$, then there is
$V \in \mathcal{O}(X)$ such that $K_\lambda \subseteq V \subseteq U$ and $V$ is a union of blocks of $\pi$ (in other
words, if $K$ is any block of $\pi$, then $V \cap K \neq \emptyset$ implies $K \subseteq V$).

This can be seen as follows. Firstly, if we equip $\pi$ with the quotient topology with
respect to the natural map $q : X \to \pi$, $x \mapsto K_\lambda$ if $x \in K_\lambda$, then $\pi$ is compact, for
$X$ is compact. Moreover, $\pi$ is Hausdorff. To see this, let $K_\lambda$ and $K_\mu$ be two distinct
points in $\pi$. Recall that $x, y \in K_\lambda$ if and only if $f(x) = f(y)$ for each $f \in C$. Since
$K_\lambda \neq K_\mu$, there is some $x \in K_\lambda$, some $y \in K_\mu$ and some $f \in C$ such that $f(x) \neq f(y)$,
whence there are open disjoint $U, V \subset \mathbb{C}$ such that $f(x) \in U$ and $f(y) \in V$.
Define $\hat{f} : \pi \to \mathbb{C}$ by $\hat{f}(K_\lambda) = f(x)$ for some $x \in K_\lambda$. By definition of $K_\lambda$, this is
independent of the choice of $x \in K_\lambda$, hence $\hat{f}$ is well defined. Again by definition,
we have $f = \hat{f} \circ q$, hence $q^{-1}(\hat{f}^{-1}[U]) = f^{-1}[U]$, which is open in $X$ since $f$ is
continuous. Since $\pi$ is equipped with the quotient topology, it follows that $\hat{f}^{-1}[U]$
is open in $\pi$, and similarly $\hat{f}^{-1}[V]$ is open. These two open sets are disjoint because
$U$ and $V$ are. Moreover, we have $\hat{f}(K_\lambda) = f(x)$ and
$f(x) \in U$, hence $K_\lambda \in \hat{f}^{-1}[U]$ and similarly, $K_\mu \in \hat{f}^{-1}[V]$. We conclude that $\pi$ is
also Hausdorff. Since $q$ is a continuous map between compact Hausdorff spaces, it
follows that $q$ is closed. It is a standard result in topology that $q$ is closed iff $\pi$ is a
u.s.c. decomposition, so we have now proved the latter.

Consequently, by the same maps (9.3) and (9.5), the poset $\mathcal{C}(C(X))$ is anti-isomorphic
to the poset $\mathcal{D}(X)$ of all u.s.c. decompositions of $X$ in the natural ordering
$\leq$ (which proves that $\mathcal{D}(X)$ is a complete lattice, since $\mathcal{C}(C(X))$ is). This is
still a complicated poset; assuming $X$ to be larger than a singleton, the next step is to
identify the simpler poset $\mathcal{F}_2(X)$ of all closed subsets of $X$ containing at least two
elements within $\mathcal{D}(X)$, where (as above) we identify a closed $K \subseteq X$ with the (u.s.c.)
partition $\pi_K$ of $X$ whose blocks are $K$ and all singletons not lying in $K$ (note that the
poset $\mathcal{F}(X)$ of all closed subsets of $X$ is less useful, since any singleton in $\mathcal{F}(X)$
gives rise to the bottom element of $\mathcal{D}(X)$). To do so, we first recall that $\beta$ is said to
cover $\alpha$ in some poset if $\alpha < \beta$, and $\alpha \leq \gamma < \beta$ implies $\alpha = \gamma$. If the poset has a
bottom element, then its covers are precisely its atoms. Furthermore, note that since
the bottom element $0$ of $\mathcal{D}(X)$ consists of singletons, the atoms in $\mathcal{D}(X)$ are the partitions
of the form $\pi_{\{x_1,x_2\}}$ (where $x_1 \neq x_2$). It follows that some partition $\pi \in \mathcal{D}(X)$
lies in $\mathcal{F}_2(X)$ iff exactly one of the following conditions holds:

• $\pi$ is an atom in $\mathcal{D}(X)$, i.e., $\pi = \pi_{\{x_1,x_2\}}$ for some $x_1, x_2 \in X$, $x_1 \neq x_2$;

• $\pi$ covers three (distinct) atoms in $\mathcal{D}(X)$, in which case $\pi = \pi_{\{x_1,x_2,x_3\}}$ where all $x_i$
are different, which covers the atoms $\pi_{\{x_1,x_2\}}$, $\pi_{\{x_1,x_3\}}$, $\pi_{\{x_2,x_3\}}$;

• If $\alpha \neq \beta$ are atoms in $\mathcal{D}(X)$ such that $\alpha < \pi$ and $\beta < \pi$, there is an atom $\gamma < \pi$
such that there are three (distinct) atoms covered by $\alpha \vee \gamma$ and three (distinct)
atoms covered by $\beta \vee \gamma$. In that case, $\pi = \pi_K$ where $K$ has more than three elements:
if $\alpha = \pi_{\{x_1,x_2\}}$ and $\beta = \pi_{\{x_3,x_4\}}$, then by the assumption $\alpha \neq \beta$,
the set $\{x_1,x_2,x_3,x_4\}$ (which lies in $K$) has at least three distinct elements, say
$\{x_1,x_2,x_3\}$. Hence we may take $\gamma = \pi_{\{x_2,x_3\}}$, in which case $\alpha \vee \gamma = \pi_{\{x_1,x_2,x_3\}}$,
which covers the atoms $\alpha$, $\gamma$, and $\pi_{\{x_1,x_3\}}$. Likewise, we have $\beta \vee \gamma = \pi_{\{x_2,x_3,x_4\}}$,
which covers three atoms $\beta$, $\gamma$, and $\pi_{\{x_2,x_4\}}$.

In order to see that $\pi$ satisfying the third condition must be of the form $\pi_K$, assume
the converse. So $\pi$ contains two blocks $K_\lambda$ and $K_\mu$ consisting of two or more elements.
Say $\{x_1,x_2\} \subseteq K_\lambda$ and $\{x_3,x_4\} \subseteq K_\mu$. Then $\alpha = \pi_{\{x_1,x_2\}}$ and $\beta = \pi_{\{x_3,x_4\}}$ are
atoms such that $\alpha, \beta \leq \pi$, and there is an atom $\gamma = \pi_{\{x_5,x_6\}} \leq \pi$ such that there are
three atoms covered by $\alpha \vee \gamma$, and there are three atoms covered by $\beta \vee \gamma$. It follows
from the second condition that $\alpha \vee \gamma = \pi_L$ with $L$ a three-point set. This implies that
$\{x_1,x_2\} \cap \{x_5,x_6\}$ is not empty, from which it follows that $\alpha \vee \gamma = \pi_{\{x_1,x_2,x_5,x_6\}}$.
Similarly, we find $\beta \vee \gamma = \pi_{\{x_3,x_4,x_5,x_6\}}$. Since $\{x_1,x_2,x_5,x_6\}$ and $\{x_3,x_4,x_5,x_6\}$ overlap,
we obtain $\alpha \vee \beta \vee \gamma = \pi_{\{x_1,x_2,x_3,x_4,x_5,x_6\}}$. Moreover, $\alpha, \beta, \gamma \leq \pi$, so $\alpha \vee \beta \vee \gamma \leq \pi$.
However, since $x_1, x_2 \in K_\lambda$, we must have $\{x_1,x_2,x_3,x_4,x_5,x_6\} \subseteq K_\lambda$ by definition of
the order on $\mathcal{D}(X)$. But since $x_3, x_4 \in K_\mu$, we must also have $\{x_1,x_2,x_3,x_4,x_5,x_6\} \subseteq
K_\mu$, which is not possible, since $K_\lambda$ and $K_\mu$ are distinct blocks, hence disjoint. We
conclude that $\pi$ can have only one block $K$ of two or more elements, hence $\pi = \pi_K$.

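Thus $\mathcal{F}_2(X) \subset \mathcal{D}(X)$ has been characterized order-theoretically. Moreover,

$$\pi = \bigvee_{x \in X} \pi_{K(x)}, \tag{9.6}$$

where $K(x)$ is the unique block of $\pi$ that contains $x$. Hence $\mathcal{F}_2(X)$ determines $\mathcal{D}(X)$.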

Let $X$ and $Y$ be compact Hausdorff spaces of cardinality at least two (so that
the empty set and singletons are excluded). By the previous analysis, an order
isomorphism $\mathsf{B} : \mathcal{C}(C(X)) \to \mathcal{C}(C(Y))$ is equivalent to an order isomorphism
$\mathcal{D}(X) \to \mathcal{D}(Y)$, which in turn restricts to an order isomorphism $\mathcal{F}_2(X) \to \mathcal{F}_2(Y)$.

Lemma 9.5. If $X$ and $Y$ are compact Hausdorff spaces of cardinality at least two,
then any order isomorphism $\mathsf{F} : \mathcal{F}_2(X) \to \mathcal{F}_2(Y)$ is induced by a homeomorphism
$\varphi : X \to Y$ via $\mathsf{F}(F) = \varphi(F)$, i.e., $\mathsf{F}(F) = \bigcup_{x \in F} \{\varphi(x)\}$. Moreover, if $X$ and $Y$ have
cardinality at least three, then $\varphi$ is uniquely determined by $\mathsf{F}$.

To see the idea, we first prove this for finite $X$, where $\mathcal{F}_2(X)$ simply consists of all
subsets of $X$ having at least two elements, etc. It is easy to see that $X$ and $Y$ must
have the same cardinality $|X| = |Y| = n$. If $n = 2$, then $\mathcal{F}_2(X) = \{X\}$ etc., so there is
only one map $\mathsf{F}$, which is induced by each of the two possible maps $\varphi : X \to Y$, so
that $\varphi$ exists but fails to be unique. If $n > 2$, then $\mathsf{F}$ must map each subset of $X$ with
$n - 1$ elements to some subset of $Y$ with $n - 1$ elements, so that taking complements
we obtain a unique bijection $\varphi : X \to Y$. To show that $\varphi$ induces $\mathsf{F}$, note that the
meet $\wedge$ in $\mathcal{F}_2(X)$ is simply intersection $\cap$, and also that for any $F \in \mathcal{F}_2(X)$,

$$F = \bigcup_{x \in F} \{x\} = \bigcap_{x \notin F} \{x\}^c = \Big(\bigcup_{x \notin F} \{x\}\Big)^c, \tag{9.7}$$

where $A^c = X \setminus A$. Since $\mathsf{F}$ is an order isomorphism, it preserves $\wedge = \cap$, so that

$$\mathsf{F}(F) = \mathsf{F}\Big(\bigcap_{x \notin F} \{x\}^c\Big) = \bigcap_{x \notin F} \{\varphi(x)\}^c = \Big(\bigcup_{x \notin F} \{\varphi(x)\}\Big)^c = \bigcup_{x \in F} \{\varphi(x)\}. \tag{9.8}$$
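
The finite case can be replayed on a small example. The sketch below is only an illustration under stated assumptions: the spaces, the bijection `hidden`, and the helper names `closed_sets_2` and `recover_phi` are all hypothetical. It builds an order isomorphism on subsets with at least two elements from a bijection, and then recovers that bijection purely from the images of the co-atoms $X \setminus \{x\}$, as in the argument for $n > 2$.

```python
from itertools import combinations

def closed_sets_2(points):
    """F_2(X) for a finite discrete X: all subsets with at least two elements."""
    pts = sorted(points)
    return [frozenset(c) for r in range(2, len(pts) + 1)
            for c in combinations(pts, r)]

def recover_phi(F, X, Y):
    """Read off phi from the co-atoms: F(X minus {x}) = Y minus {phi(x)}."""
    X, Y = frozenset(X), frozenset(Y)
    phi = {}
    for x in X:
        (y,) = Y - F[X - {x}]   # exactly one point of Y is missing
        phi[x] = y
    return phi

X = {0, 1, 2, 3}
Y = {"a", "b", "c", "d"}
hidden = {0: "c", 1: "a", 2: "d", 3: "b"}   # hypothetical bijection X -> Y

# The order isomorphism induced by the hidden bijection.
F = {S: frozenset(hidden[x] for x in S) for S in closed_sets_2(X)}

phi = recover_phi(F, X, Y)
induced = all(F[S] == frozenset(phi[x] for x in S) for S in F)
```

For a two-point space the loop in `recover_phi` would fail, since $X \setminus \{x\}$ is then a singleton and not an element of $\mathcal{F}_2(X)$; this mirrors the loss of uniqueness for $n = 2$.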

Now assume that $X$ is infinite. Let $x \in X$. If $x$ is not isolated, we define $\varphi(x)$
as follows. Let $\mathcal{O}(x)$ denote the set of all open neighborhoods of $x$. Since $x$ is not
isolated, each $O \in \mathcal{O}(x)$ contains at least another element, so $\overline{O} \in \mathcal{F}_2(X)$. Moreover,
finite intersections of elements of $\{\overline{O} : O \in \mathcal{O}(x)\}$ are still in $\mathcal{F}_2(X)$. Indeed,
if $O_1, \ldots, O_n \in \mathcal{O}(x)$, then $O_1 \cap \ldots \cap O_n$ is an open set containing $x$, and
since $O_1 \cap \ldots \cap O_n \subseteq \overline{O_1} \cap \ldots \cap \overline{O_n}$, it follows that $\overline{O_1} \cap \ldots \cap \overline{O_n} \in \mathcal{F}_2(X)$. Since
$\mathsf{F}$ is an order isomorphism, we find that finite intersections of $\{\mathsf{F}(\overline{O}) : O \in \mathcal{O}(x)\}$
are contained in $\mathcal{F}_2(Y)$. This implies that $\{\mathsf{F}(\overline{O}) : O \in \mathcal{O}(x)\}$ satisfies the finite
intersection property. As $Y$ is compact, it follows that $I_x = \bigcap_{O \in \mathcal{O}(x)} \mathsf{F}(\overline{O})$ is nonempty.
We can say more: it turns out that $I_x$ contains exactly one element. Indeed,
assume that there are two different points $y_1, y_2 \in I_x$. Then $\{y_1,y_2\} \in \mathcal{F}_2(Y)$, so
$\mathsf{F}^{-1}(\{y_1,y_2\}) \in \mathcal{F}_2(X)$. Since $\{y_1,y_2\} \subseteq \mathsf{F}(\overline{O})$ for each $O \in \mathcal{O}(x)$, we also find that
$\mathsf{F}^{-1}(\{y_1,y_2\}) \subseteq \overline{O}$ for each $O \in \mathcal{O}(x)$. This implies that

$$\mathsf{F}^{-1}(\{y_1,y_2\}) \subseteq \bigcap_{O \in \mathcal{O}(x)} \overline{O} = \{x\}, \tag{9.9}$$

where the last equality holds by normality of $X$. But this is a contradiction with $\mathsf{F} :
\mathcal{F}_2(X) \to \mathcal{F}_2(Y)$ being a bijection. So $I_x$ contains exactly one point. We define $\varphi(x)$
such that $\{\varphi(x)\} = I_x$. Notice that $\varphi(x)$ cannot be isolated in $Y$, since if we assume
otherwise, then $Y \setminus \{\varphi(x)\}$ must be a co-atom in $\mathcal{F}_2(Y)$, whence $\mathsf{F}^{-1}(Y \setminus \{\varphi(x)\})$
is a co-atom in $\mathcal{F}_2(X)$, which must be of the form $X \setminus \{z\}$ for some isolated $z \in X$.
Since $x$ is not isolated, we cannot have $x = z$, so $X \setminus \{z\}$ is an open neighborhood
of $x$, which is even clopen since $z$ is isolated. By definition of $\varphi(x)$, we must have
$\varphi(x) \in \mathsf{F}(X \setminus \{z\})$, but $\mathsf{F}(X \setminus \{z\}) = Y \setminus \{\varphi(x)\}$. We found a contradiction, hence
$\varphi(x)$ cannot be isolated. Now assume that $x$ is an isolated point. Then $X \setminus \{x\}$ is a
co-atom in $\mathcal{F}_2(X)$, so $\mathsf{F}(X \setminus \{x\})$ is a co-atom in $\mathcal{F}_2(Y)$, too. Clearly this implies that
$\mathsf{F}(X \setminus \{x\}) = Y \setminus \{y\}$ for some unique $y \in Y$, which must be isolated, since $Y \setminus \{y\}$
is closed. We define $\varphi(x) = y$.

In an analogous way, $\mathsf{F}^{-1}$ induces a map $\psi : Y \to X$. We shall show that $\varphi$ and
$\psi$ are each other's inverses. Let $x \in X$ be isolated. We have seen that $\varphi(x)$ must be
isolated as well, and that $\varphi(x)$ is defined by the equation $\mathsf{F}(X \setminus \{x\}) = Y \setminus \{\varphi(x)\}$.
Since $\mathsf{F}$ is an order isomorphism, we have $X \setminus \{x\} = \mathsf{F}^{-1}(Y \setminus \{\varphi(x)\})$. Since $\varphi(x)$
is isolated, we find by definition of $\psi$ that $\psi(\varphi(x)) = x$. In a similar way we find
that $\varphi(\psi(y)) = y$ for each isolated $y \in Y$. Now assume that $x$ is not isolated and let
$F \in \mathcal{F}_2(X)$ such that $x \in F$. Then

$$\{\varphi(x)\} = \bigcap_{O \in \mathcal{O}(x)} \mathsf{F}(\overline{O}) \subseteq \bigcap \{\mathsf{F}(\overline{O}) : O \text{ open}, F \subseteq O\} = \mathsf{F}\Big(\bigcap \{\overline{O} : O \text{ open}, F \subseteq O\}\Big) = \mathsf{F}(F), \tag{9.10}$$

where the last equality follows by complete regularity of $X$. The penultimate
equality follows from the following facts. Firstly, the set $\bigcap \{\overline{O} : O \text{ open}, F \subseteq O\}$
is closed since it is the intersection of closed sets. Moreover, the intersection contains
more than one point, since $F$ contains two or more points and $F \subseteq \overline{O}$ for each
$O$. Hence $\bigcap \{\overline{O} : O \text{ open}, F \subseteq O\} \in \mathcal{F}_2(X)$, and since $\mathsf{F}$ is an order isomorphism,
it preserves infima, which justifies the penultimate equality. Hence $\varphi(x) \in \mathsf{F}(F)$ for
each $F \in \mathcal{F}_2(X)$ containing $x$. Since $x$ is not isolated, $\varphi(x)$ is not isolated either.
Hence in a similar way, we find that $\psi(\varphi(x)) \in \mathsf{F}^{-1}(G)$ for each $G \in \mathcal{F}_2(Y)$ containing
$\varphi(x)$. Let $z = \psi(\varphi(x))$. Combining both statements, we find that $z \in F$ for
each $F \in \mathcal{F}_2(X)$ such that $x \in F$. In other words, $z \in \bigcap \{F \in \mathcal{F}_2(X) : x \in F\}$. Since
$x$ is not isolated, each $O \in \mathcal{O}(x)$ contains at least two points. Hence

$$\bigcap \{F \in \mathcal{F}_2(X) : x \in F\} \subseteq \bigcap \{\overline{O} : O \in \mathcal{O}(x)\} = \{x\}, \tag{9.11}$$

where we used complete regularity of $X$ in the last equality. We conclude that $z = x$,
so $\psi(\varphi(x)) = x$. In a similar way, we find that $\varphi(\psi(y)) = y$ for each non-isolated
$y \in Y$. We conclude that $\varphi$ is a bijection with inverse $\varphi^{-1} = \psi$.







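Continuing the proof of Lemma 9.5, we have to show that if $F \in \mathcal{F}_2(X)$, then
$\varphi[F] = \mathsf{F}(F)$. Let $x \in F$. In the proof that $\varphi$ is a bijection we already noticed that
$\varphi(x) \in \mathsf{F}(F)$ if $x$ is not isolated. If $x$ is isolated in $X$, then we first assume that $F$
has at least three points. Since $\{x\}$ is open, $G = F \setminus \{x\}$ is closed. Since $F$ contains
at least three points, $G \in \mathcal{F}_2(X)$. So $G$ is covered by $F$ in $\mathcal{F}_2(X)$, so $\mathsf{F}(F)$ covers
$\mathsf{F}(G)$. It follows that there must be an element $y_G \in Y \setminus \mathsf{F}(G)$ such that

$$\mathsf{F}(F) = \mathsf{F}(G \cup \{x\}) = \mathsf{F}(G) \cup \{y_G\}. \tag{9.12}$$

Both $G \cup \{x\}$ and $X \setminus \{x\}$ are elements of $\mathcal{F}_2(X)$, so

$$\mathsf{F}(G) = \mathsf{F}\big((G \cup \{x\}) \cap (X \setminus \{x\})\big) = \mathsf{F}(G \cup \{x\}) \cap \mathsf{F}(X \setminus \{x\}) = (\mathsf{F}(G) \cup \{y_G\}) \cap (Y \setminus \{\varphi(x)\}), \tag{9.13}$$

where $\mathsf{F}(X \setminus \{x\}) = Y \setminus \{\varphi(x)\}$ by definition of values of $\varphi$ at isolated points. Since
$x \notin G$ and $\mathsf{F}$ preserves inclusions, this latter equation also implies $\mathsf{F}(G) \subseteq Y \setminus \{\varphi(x)\}$.
Hence we find

$$\mathsf{F}(G) = (\mathsf{F}(G) \cup \{y_G\}) \cap (Y \setminus \{\varphi(x)\}) = \mathsf{F}(G) \cup \big(\{y_G\} \cap (Y \setminus \{\varphi(x)\})\big). \tag{9.14}$$

Thus we obtain $\{y_G\} \cap (Y \setminus \{\varphi(x)\}) \subseteq \mathsf{F}(G)$, but since $y_G \notin \mathsf{F}(G)$, we must have
$\varphi(x) = y_G$. As a consequence, we obtain $\mathsf{F}(F) = \mathsf{F}(G) \cup \{\varphi(x)\}$, so $\varphi(x) \in \mathsf{F}(F)$.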

Summarizing, if $F$ has at least three points, then $\varphi(x) \in \mathsf{F}(F)$ for $x \in F$, regardless of
whether $x$ is isolated or not. So $\varphi[F] \subseteq \mathsf{F}(F)$ for each $F \in \mathcal{F}_2(X)$ such that $F$ has at
least three points. Let $F \in \mathcal{F}_2(X)$ have exactly two points. Then there are $F_1, F_2 \in
\mathcal{F}_2(X)$ with exactly three points such that $F = F_1 \cap F_2$. Then since $\varphi$ as a bijection
and $\mathsf{F}$ as an order isomorphism both preserve intersections in $\mathcal{F}_2(X)$, we find

$$\varphi[F] = \varphi[F_1 \cap F_2] = \varphi[F_1] \cap \varphi[F_2] \subseteq \mathsf{F}(F_1) \cap \mathsf{F}(F_2) = \mathsf{F}(F_1 \cap F_2) = \mathsf{F}(F). \tag{9.15}$$

So $\varphi[F] \subseteq \mathsf{F}(F)$ for each $F \in \mathcal{F}_2(X)$. In a similar way, we find $\varphi^{-1}[G] \subseteq \mathsf{F}^{-1}(G)$ for
each $G \in \mathcal{F}_2(Y)$. So if we substitute $G = \mathsf{F}(F)$, we obtain $\varphi^{-1}[\mathsf{F}(F)] \subseteq F$. Since $\varphi$
is a bijection, it follows that $\mathsf{F}(F) = \varphi[F]$ for each $F \in \mathcal{F}_2(X)$. As a consequence, $\varphi$
induces a one-one correspondence between closed subsets of $X$ and closed subsets
of $Y$. Hence $\varphi$ is a homeomorphism. This proves Lemma 9.5. □

The special case of Theorem 9.4 where A and B are commutative now follows if 
we combine all steps so far: 

1. The Gelfand isomorphism allows us to assume $A = C(X)$ and $B = C(Y)$, as above.

2. The order isomorphism $\mathsf{B} : \mathcal{C}(A) \to \mathcal{C}(B)$ determines an order isomorphism
$\mathsf{F} : \mathcal{D}(X) \to \mathcal{D}(Y)$ of the underlying lattices of u.s.c. decompositions, and vice versa.

3. Because of (9.6), the order isomorphism $\mathsf{F}$ in turn determines and is determined
by an order isomorphism $\mathsf{F} : \mathcal{F}_2(X) \to \mathcal{F}_2(Y)$.

4. Lemma 9.5 yields a homeomorphism $\varphi : X \to Y$ inducing $\mathsf{F} : \mathcal{F}_2(X) \to \mathcal{F}_2(Y)$.

5. The inverse pullback $(\varphi^{-1})^* : C(X) \to C(Y)$ is an isomorphism of C*-algebras,
which (running backwards) reproduces the initial map $\mathsf{B} : \mathcal{C}(C(X)) \to \mathcal{C}(C(Y))$.
Therefore, in the commutative case we apparently obtain rather more than a weak
Jordan isomorphism $\mathsf{J} : A_{sa} \to B_{sa}$: we even found an isomorphism $\mathsf{J} : A \to B$ of
C*-algebras. However, if $A$ and $B$ are commutative, the condition of linearity on each
commutative C*-subalgebra $C$ of $A$ includes $C = A$, so that (after complexification)
weak Jordan isomorphisms are the same as isomorphisms of C*-algebras.

We now turn to the general case, in which $A$ and $B$ are both noncommutative (the
case where one, say $A$, is commutative but the other is not, cannot occur, since $\mathcal{C}(A)$
would be a complete lattice but $\mathcal{C}(B)$ would not). Let $D$ and $E$ be maximal abelian
C*-subalgebras of $A$, so that the corresponding elements of $\mathcal{C}(A)$ are maximal in the
order-theoretic sense. Given an order isomorphism $\mathsf{B} : \mathcal{C}(A) \to \mathcal{C}(B)$, we restrict the
map $\mathsf{B}$ to the down-set $\downarrow\! D = \mathcal{C}(D)$ in $\mathcal{C}(A)$ so as to obtain an order homomorphism
$\mathsf{B}_{|D} : \mathcal{C}(D) \to \mathcal{C}(B)$. The image of $\mathcal{C}(D)$ under $\mathsf{B}$ must have a maximal element
(since $\mathsf{B}$ is an order isomorphism), and so there is a maximal commutative
C*-subalgebra $\tilde{D}$ of $B$ such that $\mathsf{B}_{|D} : \mathcal{C}(D) \to \mathcal{C}(\tilde{D})$ is an order isomorphism. Applying
the previous result, we obtain an isomorphism $\mathsf{J}_D : D \to \tilde{D}$ of commutative
C*-algebras that induces $\mathsf{B}_{|D}$. The same applies to $E$, so we also have an isomorphism
$\mathsf{J}_E : E \to \tilde{E}$ of commutative C*-algebras that induces $\mathsf{B}_{|E}$. Let $C = D \cap E$, which lies
in $\mathcal{C}(A)$. We now show that $\mathsf{J}_D$ and $\mathsf{J}_E$ coincide on $C$. There are three cases.

1. $\dim(C) = 1$. In that case $C = \mathbb{C} \cdot 1_A$ is the bottom element of $\mathcal{C}(A)$, so it must be
sent to the bottom element $\tilde{C} = \mathbb{C} \cdot 1_B$ of $\mathcal{C}(B)$, whence the claim.

2. $\dim(C) = 2$. This is the hard case dealt with below.

3. $\dim(C) > 2$. This case is settled by the uniqueness claim in Lemma 9.5.

So assume $\dim(C) = 2$. In that case, $C = C^*(e)$ for some proper projection $e \in
\mathcal{P}(A)$, which is equivalent to $C$ being an atom in $\mathcal{C}(A)$. Recall that all our
C*-algebras are unital, and that by assumption C*-subalgebras $C$ share the unit of
the ambient C*-algebra $A$, hence $C^*(e)$ contains the unit of $A$. Hence $\tilde{C} = \mathsf{B}(C) =
\mathsf{B}_{|D}(C) = \mathsf{B}_{|E}(C)$ is an atom in $\mathcal{C}(B)$, which implies that $\tilde{C} = C^*(\tilde{e})$ for some
projection $\tilde{e} \in \mathcal{P}(B)$. If $\mathsf{J}_D(e) = \mathsf{J}_E(e)$ we are ready, so we must exclude the case
$\mathsf{J}_D(e) = \tilde{e}$, $\mathsf{J}_E(e) = 1_B - \tilde{e}$. This exclusion again requires a case distinction:


$$\dim(eAe) = \dim(e^\perp A e^\perp) = 1; \tag{9.16}$$

$$\dim(eAe) = 1, \quad \dim(e^\perp A e^\perp) > 1; \tag{9.17}$$

$$\dim(eAe) > 1, \quad \dim(e^\perp A e^\perp) > 1, \tag{9.18}$$

where $e^\perp = 1_A - e$. Each of these cases is nontrivial, and we need another lemma.

Lemma 9.6. Let $C \in \mathcal{C}(A)$ be maximal (i.e., $C \subset A$ is maximal abelian).

1. For each projection $e \in \mathcal{P}(C)$ we have $\dim(eCe) = 1$ iff $\dim(eAe) = 1$.

2. We have $\dim(C) = 2$ iff either $A \cong \mathbb{C}^2$ or $A \cong M_2(\mathbb{C})$.

Proof. For the first claim, $\dim(eAe) = 1$ clearly implies $\dim(eCe) = 1$. For the converse
implication, assume ad absurdum that $\dim(eAe) > 1$, so that there is an $a \in A$
for which $eae \neq \lambda \cdot e$ for any $\lambda \in \mathbb{C}$. If also $\dim(eCe) = 1$, then any $c \in C$ takes the
form $c = \mu \cdot e + e^\perp c e^\perp$ for some $\mu \in \mathbb{C}$. Indeed, since $c, e, e^\perp$ commute within $C$,

$$c = ce + ce^\perp = ce^2 + c(e^\perp)^2 = ece + e^\perp c e^\perp = \mu e + e^\perp c e^\perp, \tag{9.19}$$

where the last equality follows since $ece \in eCe$, which is spanned by $e$. This implies
that $eae \in C'$ (where $C'$ is the commutant of $C$ within $A$): indeed, $e e^\perp = 0$ gives
$(eae)c = \mu \cdot eae = c(eae)$. Since $C$ is maximal
abelian, we have $C = C'$, whence $eae \in C$. Now $eae = e(eae)e$, hence $eae \in eCe$,
whence $eae = \lambda \cdot e$ for some $\lambda \in \mathbb{C}$. Contradiction. According to Theorem C.169.1,
the assumption $\dim(C) = 2$ implies that $A$ is finite-dimensional, upon which
Theorem C.163 and (C.641) yield the second claim. □
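
The first claim of Lemma 9.6 can be checked numerically in a small case. The following sketch assumes $A = M_3(\mathbb{C})$ with $C$ the diagonal matrices (a maximal abelian subalgebra); the choice of algebra and the helper names `unit` and `corner_dim` are illustrative assumptions, not taken from the text. It computes $\dim(eCe)$ and $\dim(eAe)$ as matrix ranks of the compressed basis elements.

```python
import numpy as np

def unit(i, j, n=3):
    """Matrix unit E_ij in M_n(C)."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

def corner_dim(e, basis):
    """Dimension of span{e b e : b in basis}."""
    vecs = np.array([(e @ b @ e).ravel() for b in basis])
    return int(np.linalg.matrix_rank(vecs))

n = 3
A_basis = [unit(i, j) for i in range(n) for j in range(n)]   # all of M_3(C)
C_basis = [unit(i, i) for i in range(n)]                      # diagonal matrices

e_min = unit(0, 0)                  # rank-one projection in C
e_big = unit(0, 0) + unit(1, 1)     # rank-two projection in C

dims = {
    "e_min": (corner_dim(e_min, C_basis), corner_dim(e_min, A_basis)),
    "e_big": (corner_dim(e_big, C_basis), corner_dim(e_big, A_basis)),
}
```

For the rank-one projection both corners are one-dimensional, whereas for the rank-two projection neither is, in line with the equivalence stated in the lemma.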

Having proved Lemma 9.6, we move on to analyze the cases (9.16) - (9.18).

• Eq. (9.16) implies that $C$ is maximal, as follows. Any element $a \in A$ is a sum
of $eae$, $e^\perp a e^\perp$, $eae^\perp$, and $e^\perp a e$, and nonzero elements of $C' = \{e\}'$ can only be of
the first two types. If (9.16) holds, then $\dim(C') = 2$, but since $C$ is abelian we
have $C \subseteq C'$ and since $\dim(C) = 2$ we obtain $C' = C$. Lemma 9.6.2 then implies
that either $A \cong \mathbb{C}^2$ or $A \cong M_2(\mathbb{C})$. These C*-algebras have been analyzed after
the statement of Theorem 9.4, and since those two $A$'s conversely imply (9.16),
we may exclude them in dealing with (9.17) - (9.18). By Lemma 9.6.2 (applied
to $D$ and $E$ instead of $C$), in what follows we may assume that $\dim(D) > 2$ and
$\dim(E) > 2$ (as $D$ and $E$ are maximal).

• Eq. (9.17) implies $\dim(eD) = 1$. Assuming $\mathsf{J}_D(e) = \tilde{e}$, this implies $\dim(\tilde{e}\tilde{D}) = 1$
(since $\mathsf{J}_D$ is an isomorphism). Applying Lemma 9.6.1 to $B$ gives $\dim(\tilde{e}B\tilde{e}) = 1$
(since $\tilde{D}$ is maximal). If also $\dim((1_B - \tilde{e})B(1_B - \tilde{e})) = 1$, then $\dim(\tilde{D}) = 2$,
whence $\dim(D) = 2$, which we excluded. Hence

$$\dim((1_B - \tilde{e})B(1_B - \tilde{e})) > 1. \tag{9.20}$$

Applied to $\mathsf{J}_E$ this gives $\mathsf{J}_E(e) = \tilde{e}$, and hence $\mathsf{J}_D$ and $\mathsf{J}_E$ coincide on $C = C^*(e)$.

• Eq. (9.18) implies that $\dim(eDe) > 1$ as well as $\dim(e^\perp E e^\perp) > 1$ (apply Lemma
9.6.1 to $D$ and $E$). Since $\dim(eDe) > 1$, there is some $a \in D$ such that $e$ and
$a' = eae \in D$ are linearly independent, and similarly there is some $b \in E$ such
that $b' = e^\perp b e^\perp$ is linearly independent of $e^\perp$. Then $a', b', e$ commute (in fact,
$a'b' = b'a' = 0$), so that we may form the abelian C*-algebras $C_1 = C^*(e,a') \subseteq D$
and $C_2 = C^*(e,b') \subseteq E$, which (also containing the unit $1_A$) both have dimension
at least three. We also form $C_3 = C^*(e,a',b')$, which contains $C_1$ and $C_2$ and
hence is at least three-dimensional, too. Because $D$ and $E$ are maximal abelian,
$C_3$ must lie in both $D$ and $E$. Applying the abelian case of the theorem already
proved to $D$ and $E$, as before, but replacing $C$ used so far by $C_3$, we find that $\mathsf{J}_D$
and $\mathsf{J}_E$ coincide on $C_3$ (as its dimension is $> 2$). In particular, $\mathsf{J}_D(e) = \mathsf{J}_E(e)$.

To finish the proof, we first note that Theorem 9.4 holds for $A = B = \mathbb{C}$ by
inspection, whereas the cases $A = B = \mathbb{C}^2$ or $= M_2(\mathbb{C})$ have already been discussed.

In all other cases we define $\mathsf{J} : A_{sa} \to B_{sa}$ by putting $\mathsf{J}(a) = \mathsf{J}_D(a)$ for any maximal
abelian unital C*-subalgebra $D$ containing $C = C^*(a)$ and hence $a$; as we just
saw, this is independent of the choice of $D$. Since each $\mathsf{J}_D$ is an isomorphism of
commutative C*-algebras, $\mathsf{J}$ is a weak Jordan isomorphism. Finally, uniqueness of
$\mathsf{J}$ (under the stated restriction on $A$) follows from Lemma 9.5. □
Theorem 9.4 raises the question whether we can strengthen weak Jordan isomorphisms
to Jordan isomorphisms (i.e. invertible linear maps that preserve the Jordan product,
cf. Appendix C.25). This hinges on the extendibility of weak Jordan isomorphisms
to linear maps (which of course continue to preserve the Jordan product and hence
are automatically Jordan isomorphisms). A general result in this direction is:

Theorem 9.7. Let $A$ and $B$ be unital AW*-algebras, where $A$ contains no summand
of type $\mathrm{I}_2$. Then there is a bijective correspondence between order isomorphisms
$\mathsf{B} : \mathcal{C}(A) \to \mathcal{C}(B)$ and Jordan isomorphisms $\mathsf{J} : A_{sa} \to B_{sa}$.

This follows from Gleason's Theorem for AW*-algebras, which we will neither
state nor prove. If $A = B = B(H)$, then the ordinary Gleason Theorem suffices to
yield the crucial lemma for Wigner's Theorem for Bohr symmetries (i.e. Theorem
5.4.6):

Lemma 9.8. Let $H$ be a Hilbert space of dimension greater than two. Then any Bohr
symmetry of $\mathcal{C}(B(H))$ is induced by a Jordan symmetry of $B(H)_{sa}$.

Proof. This follows from Theorem 9.4 and Corollary 5.22, which for the case at
hand turns weak Jordan isomorphisms into Jordan isomorphisms. □

We finally turn to symmetries of projection lattices. Theorem C.174 shows that
for von Neumann algebras (and more generally for AW*-algebras) $A$ (without summand
of type $\mathrm{I}_2$) and $B$, any isomorphism $\mathsf{N} : \mathcal{P}(A) \to \mathcal{P}(B)$ of the corresponding
orthocomplemented projection lattices (which automatically preserves arbitrary
suprema) is the restriction of a unique Jordan isomorphism $\mathsf{J} : A_{sa} \to B_{sa}$.

This completes the argument to the effect that for many C*-algebras of observables
$A$ (including $B(H)$ for $\dim(H) > 1$ as far as nos. 1-4 are concerned, and having
$\dim(H) > 2$ if we also include nos. 5-6) our six seemingly different notions of symmetry
of a quantum system described by a C*-algebra are equivalent. In particular,
they are equivalent to Jordan isomorphisms, which are also the easiest ones to use,
as they involve a readily identifiable part $A_{sa}$ of $A$, and (by complexification, as
explained above) may even be defined on $A$ itself (namely as those complex-linear
isomorphisms that preserve the involution $*$ as well as the Jordan product $\circ$).

Putting $B = A$ and assuming (without loss of generality) that $A \subseteq B(H)$, Theorem
C.175 then yields a separation of Jordan automorphisms into three disjoint classes:

Corollary 9.9. If $\mathsf{J}$ is a Jordan symmetry of a unital C*-algebra $A \subseteq B(H)$, then
there are three mutually orthogonal projections $e_1$, $e_2$, $e_3$ in $A' \cap A''$ such that:

1. $e_1 + e_2 + e_3 = 1_H$;

2. The map $a \mapsto \mathsf{J}(a)e_1$ from $A$ to $B(e_1 H)$ is a homomorphism (of C*-algebras);

3. The map $a \mapsto \mathsf{J}(a)e_2$ from $A$ to $B(e_2 H)$ is an anti-homomorphism (ibid.);

4. The map $a \mapsto \mathsf{J}(a)e_3$ from $A$ to $B(e_3 H)$ is both a homomorphism and an
anti-homomorphism of C*-algebras (so that the "corner" $\mathsf{J}(A)e_3$ is commutative).

If in addition $a \mapsto \mathsf{J}(a)e_1$ is not an anti-homomorphism and $a \mapsto \mathsf{J}(a)e_2$ is not a
homomorphism, then $e_1$, $e_2$, and $e_3$ are uniquely determined by these conditions.
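
A standard example of the second class in Corollary 9.9 is the transpose on $M_2(\mathbb{C})$, for which one may take $e_2 = 1$ and $e_1 = e_3 = 0$. The sketch below is an illustration on randomly drawn matrices (the seed and the helper `jordan` are assumptions made for the example): it checks that the transpose preserves the Jordan product and the involution but reverses ordinary products.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
b = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

def jordan(x, y):
    """Jordan product x o y = (xy + yx) / 2."""
    return (x @ y + y @ x) / 2

def J(x):
    """Transpose, viewed as a complex-linear map on M_2(C)."""
    return x.T

preserves_jordan = np.allclose(J(jordan(a, b)), jordan(J(a), J(b)))
preserves_adjoint = np.allclose(J(a.conj().T), J(a).conj().T)
reverses_products = np.allclose(J(a @ b), J(b) @ J(a))
preserves_products = np.allclose(J(a @ b), J(a) @ J(b))
```

For generic `a` and `b` the last flag is false while the first three are true, so the transpose is a Jordan symmetry that is an anti-homomorphism but not a homomorphism.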

As we shall now see, if the symmetries form a (Lie) group, then this result often 
justifies restricting our attention simply to homomorphisms of C*-algebras. 
9.2 Unitary implementability of symmetries

There are good reasons for the dichotomy (or even trichotomy) between homomorphisms
and anti-homomorphisms of C*-algebras left by Corollary 9.9, since
in physics certain discrete symmetries of quantum theory indeed give rise to
anti-homomorphisms: the best-known examples are time inversion T and charge
conjugation C combined with space inversion (i.e. parity) P, giving CP (there are also
other examples in condensed matter physics, like quantum spin flip). However, for
the kind of problems mainly addressed in this book it is sufficient to restrict our
attention to homomorphisms. One reason is that even if we use discrete symmetries
(where the simplest non-trivial group $\mathbb{Z}_2$ often suffices to make our point), the
models we treat simply realize these symmetries as homomorphisms. Another reason
is that if symmetries join to form a connected topological group $G$ (typically a Lie
group) and the maps $x \mapsto \mathsf{J}_x$ sending $x \in G$ to some Jordan symmetry $\mathsf{J}_x$ of the given
C*-algebra $A$ of observables form a (strongly) continuous homomorphism (see
below), then the identity $e \in G$ must be mapped to the identity $\mathrm{id}_A$, which of course is
a homomorphism of $A$. Continuity then implies that all $\mathsf{J}_x$ must be homomorphisms.

In what follows we therefore assume that $G$ is a (topological) group and that we
are given a (continuous) homomorphism $x \mapsto \alpha_x$ from $G$ into the group $\mathrm{Aut}(A)$ of all
automorphisms of $A$; note that, given our restriction to homomorphisms, we switch
notation from $\mathsf{J}$ to the customary symbol $\alpha$. Continuity here always means strong
continuity, in that for each $a \in A$ the map $x \mapsto \alpha_x(a)$ from $G$ to $A$ is continuous (so
that the map $G \times A \to A$ given by $(x,a) \mapsto \alpha_x(a)$ is continuous, as usually required
for group actions in a topological setting, cf. Proposition 5.35).

It follows from Theorem 5.4 (technically, from part 4 of that theorem, but
"morally" from all of it, including the equivalences between all kinds of symmetries)
that if $A = B(H)$, then a homomorphism $\alpha : G \to \mathrm{Aut}(B(H))$ is always implemented
by a family $u(x)$ of unitary operators on $H$, in that

$$\alpha_x(a) = u(x) a u(x)^* \quad (x \in G). \tag{9.21}$$

The group representation property $\alpha_x \alpha_y = \alpha_{xy}$ does not enforce $u(x)u(y) = u(xy)$;
indeed, as we saw in detail in §5.10 one may have a projective unitary representation
$x \mapsto u(x)$ of $G$ on $H$. However, by Theorem 5.62 one may usually pass to a central
extension $\tilde{G}$ of $G$ for which this problem does not arise (e.g., $\widetilde{SO(3)} = SU(2)$). In
Corollary 9.12 below (unbroken symmetry), even such a passage is not necessary.
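
A small illustration of the gap between $\alpha_x \alpha_y = \alpha_{xy}$ and $u(x)u(y) = u(xy)$ is provided by the Pauli matrices. The sketch below assumes the group $\mathbb{Z}_2 \times \mathbb{Z}_2$ acting on $M_2(\mathbb{C})$; the labelling of group elements by bit pairs and the names `u`, `mul`, and `alpha` are choices made for this example only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Elements of Z_2 x Z_2 as bit pairs, each implemented by a Pauli matrix.
u = {(0, 0): I2, (1, 0): sx, (0, 1): sz, (1, 1): sy}

def mul(x, y):
    """Group law of Z_2 x Z_2."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def alpha(x, a):
    """alpha_x(a) = u(x) a u(x)^*, cf. (9.21)."""
    return u[x] @ a @ u[x].conj().T

a = np.array([[1, 2 + 1j], [3, 4j]])
x, y = (1, 0), (0, 1)

automorphisms_compose = np.allclose(alpha(x, alpha(y, a)), alpha(mul(x, y), a))
unitaries_compose = np.allclose(u[x] @ u[y], u[mul(x, y)])
phase = (u[x] @ u[y]) @ np.linalg.inv(u[mul(x, y)])   # scalar multiple of the identity
```

The automorphisms compose exactly, while the unitaries compose only up to the phase $-i$, which drops out of (9.21).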

For general C*-algebras $A$, especially those modeling either classical systems
(in which case $A$ is commutative) or infinite quantum systems (where $A$ is typically
an infinite tensor product), one rarely has $\alpha(a) = uau^*$ for some $u \in A$ even for
single automorphisms $\alpha$, let alone for a whole group of them. Instead, we settle for
a weaker notion of unitary implementability, where the unitary $u$ need not be in $A$.

Definition 9.10. Let $\pi : A \to B(H)$ be a representation of $A$. An automorphism $\alpha \in
\mathrm{Aut}(A)$ is implemented in $H$ if there exists a unitary operator $u : H \to H$ such that

$$\pi(\alpha(a)) = u\pi(a)u^* \quad (a \in A). \tag{9.22}$$
The fundamental criterion for implementability uses the pullback $\alpha^* : S(A) \to S(A)$
of $\alpha : A \to A$ to the state space $S(A)$, defined by $\alpha^*\omega = \omega \circ \alpha^{-1}$; cf. §C.25.

Theorem 9.11. An automorphism $\alpha : A \to A$ can be implemented in the GNS-representation
$\pi_\omega$ defined by a state $\omega$ on $A$ iff $\pi_{\alpha^*\omega}$ and $\pi_\omega$ are unitarily equivalent.

Proof. Whether or not $\pi_{\alpha^*\omega}$ and $\pi_\omega$ are unitarily equivalent, we may define

$$w : H_\omega \to H_{\alpha^*\omega}; \tag{9.23}$$

$$w\pi_\omega(a)\Omega_\omega = \pi_{\alpha^*\omega}(\alpha(a))\Omega_{\alpha^*\omega}. \tag{9.24}$$

This operator is well defined and unitary, and satisfies $w\Omega_\omega = \Omega_{\alpha^*\omega}$ as well as
$w\pi_\omega(a)w^* = \pi_{\alpha^*\omega}(\alpha(a))$; these properties even characterize $w$. If $\pi_{\alpha^*\omega} \cong \pi_\omega$, there
exists a unitary $v : H_\omega \to H_{\alpha^*\omega}$ satisfying $v\pi_\omega(a)v^* = \pi_{\alpha^*\omega}(a)$, $a \in A$. Then $u = v^*w$
satisfies (9.22) for $\pi = \pi_\omega$. The converse is similar. □

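An important special case arises if $\omega$ is invariant under $\alpha$.

Corollary 9.12. If $\alpha^*\omega = \omega$ (that is, $\omega(\alpha(a)) = \omega(a)$ for all $a \in A$), then $\alpha$ is
implemented by a unitary operator $u_\omega : H_\omega \to H_\omega$ satisfying $u_\omega\Omega_\omega = \Omega_\omega$. In particular,
given a continuous homomorphism $\alpha : G \to \mathrm{Aut}(A)$ such that $\alpha_x^*\omega = \omega$ for
each $x \in G$, one has a family of unitaries $u_\omega(x) : H_\omega \to H_\omega$ that for all $x \in G$ satisfy

$$u_\omega(x)\Omega_\omega = \Omega_\omega; \tag{9.25}$$

$$\pi_\omega(\alpha_x(a)) = u_\omega(x)\pi_\omega(a)u_\omega(x)^*, \tag{9.26}$$

and form a continuous unitary representation of $G$ on $H_\omega$.

Proof. One easily checks that the following operators do the job:

$$u_\omega(x)\pi_\omega(a)\Omega_\omega = \pi_\omega(\alpha_x(a))\Omega_\omega. \qquad \Box$$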
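
The operators in this proof can be made concrete in finite dimension. The sketch below rests on several assumptions that are not in the text: $A = M_3(\mathbb{C})$, a faithful state $\omega(a) = \mathrm{tr}(\rho a)$ with diagonal density matrix $\rho$, and an automorphism $\alpha(a) = vav^*$ with $v$ a diagonal unitary, so that $v$ commutes with $\rho$ and $\omega$ is invariant. For a faithful state the GNS space may be realized as $M_3(\mathbb{C})$ itself with inner product $\langle b, c \rangle = \mathrm{tr}(\rho b^* c)$, cyclic vector $\Omega_\omega = 1$, and $\pi_\omega(a)$ acting by left multiplication.

```python
import numpy as np

rho = np.diag([0.5, 0.3, 0.2]).astype(complex)          # faithful density matrix
v = np.diag(np.exp(1j * np.array([0.3, 1.1, -0.7])))    # unitary commuting with rho

def omega(a):
    return np.trace(rho @ a)

def alpha(a):
    return v @ a @ v.conj().T

def inner(b, c):
    """GNS inner product <b, c> = tr(rho b^* c) on M_3(C)."""
    return np.trace(rho @ b.conj().T @ c)

def pi(a):
    """GNS representation: left multiplication by a."""
    return lambda b: a @ b

def U(b):
    """u_omega, defined by U pi(a) Omega = pi(alpha(a)) Omega."""
    return alpha(b)

def U_inv(b):
    return v.conj().T @ b @ v

Omega = np.eye(3, dtype=complex)
rng = np.random.default_rng(1)
a, b, c = (rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)) for _ in range(3))

state_invariant = np.isclose(omega(alpha(a)), omega(a))
vector_invariant = np.allclose(U(Omega), Omega)                    # (9.25)
isometric = np.isclose(inner(U(b), U(c)), inner(b, c))
implements = np.allclose(U(pi(a)(U_inv(b))), pi(alpha(a))(b))      # (9.26)
```

If $v$ did not commute with $\rho$, the state would in general fail to be invariant and `U` would no longer preserve the inner product.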

Given some $\alpha \in \mathrm{Aut}(A)$, a weak form of spontaneous symmetry breaking
(SSB) is that some state $\omega$ (it is always a state that breaks a symmetry) satisfies
$\alpha^*\omega \neq \omega$; a stronger one states that the two equivalent conditions in Theorem 9.11
are violated, i.e., that $\alpha$ cannot be implemented in the GNS-representation $\pi_\omega(A)$
(cf. Definition 9.10). In order to be physically relevant, the weaker notion has to be
supplemented with additional structure, which also guarantees that generically the
weak form implies the strong one. Part of this structure involves the identification of
suitable classes of states within which we define SSB; these classes are predicated
on a time-evolution on $A$. We also need a symmetry group instead of a single
automorphism $\alpha$ (which implicitly uses the group $\mathbb{Z}_p = \mathbb{Z}/p\mathbb{Z}$, where $p$ is the smallest
positive integer such that $\alpha^p = \mathrm{id}_A$; if no such $p$ exists the group is just $\mathbb{Z}$). Thus we need:

• A C*-algebra $A$ with time-evolution, i.e., a homomorphism $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$;

• A preferred class of states defined via $\alpha$, viz. ground states or equilibrium states;

• A symmetry group $G$ acting on $A$ via a homomorphism $\gamma : G \to \mathrm{Aut}(A)$ satisfying

$$\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t \quad (t \in \mathbb{R}, g \in G). \tag{9.27}$$




9.3 Motion in space and in time 

The C*-algebras $A$ we are going to use are the quasi-local ones introduced in §8.5
for quantum spin systems; especially recall (8.130). Also, the C*-algebra $A = B^\infty$ in
§8.2 is a case in point, but this would require some changes in what follows. The last
expression in (8.130) is convenient for introducing spatial translation symmetry

$$\tau : \mathbb{Z}^d \to \mathrm{Aut}(A) \tag{9.28}$$

of $\mathbb{Z}^d$, as follows: for $x \in \mathbb{Z}^d$, define $\tau_x : A_\Lambda \to A_{x+\Lambda}$ initially by

$$\tau_x(b(y)) = b(x + y), \tag{9.29}$$

where, for given $b \in B(H)$ and $y \in \Lambda$, the operator $b(y) \in A_\Lambda$ is the element
whose tensor factor at site $y$ is $b$ and whose factor at every site $z \neq y$ is $1_H$. Since arbitrary elements of $A_\Lambda$ are (norm-limits
of) finite linear combinations of products of such operators $b(y)$, the automorphic
(and hence isometric) property of $\tau_x$ defines its action on all of $A_\Lambda$ (if necessary
by continuous extension). Note that for $a \in A_\Lambda$ the operator $\tau_x(a)$ thus defined is
independent of the (typically non-unique) realization of $a$ in terms of the $b(y)$, because
$\tau_x$ is an isometry. The group homomorphism property of the map (9.28) thus
constructed is guaranteed by (9.29), whilst continuity is no issue since $\mathbb{Z}^d$ is discrete.

Since $A_\Lambda = \bigotimes_{y \in \Lambda} A_y$ with $A_y = B(H)$, an equivalent way to define $\tau_x$ is to use
identifications $\mathrm{id}_{y,z} : A_y \to A_z$ (since $A_y = A_z = B(H)$), which, taking tensor products,
yield isomorphisms $\mathrm{id}_{\Lambda,\Lambda'} : A_\Lambda \to A_{\Lambda'}$ whenever some bijection $\Lambda \cong \Lambda'$ is given.
In terms of those, we simply have $(\tau_x)_{|A_\Lambda} = \mathrm{id}_{\Lambda,x+\Lambda}$. Either way, the maps $(\tau_x)_{|A_\Lambda}$
extend to $\tau_x : A \to A$ by continuity. The following property then holds:

Proposition 9.13. An automorphic action $\tau$ of $\mathbb{Z}^d$ on a quasi-local C*-algebra $A$ is
asymptotically abelian in the sense that $\lim_{x \to \infty} [a, \tau_x(b)] = 0$ for all $a, b \in A$.

Here $x \to \infty$ means that any sequence $(x_n)$ with $|x_n| \to \infty$ with respect to the
Euclidean norm on $\mathbb{Z}^d$ has a subsequence $(x'_n)$ for which the stated result holds.

Proof. For $a$ and $b$ local, i.e., $a \in A_{\Lambda^{(1)}}$ and $b \in A_{\Lambda^{(2)}}$, this follows from Einstein
locality. The general case follows by approximating $a$ and $b$ by local elements. □
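
The translation rule (9.29) and the locality argument for local elements can be tried out on a toy model. The sketch below makes assumptions that go beyond the text: it replaces $\mathbb{Z}^d$ by a periodic chain of four sites with one qubit per site, so that translations are implemented by a cyclic shift unitary and "far apart" only means "on different sites". The names `site_op`, `shift_unitary`, and `tau` are hypothetical.

```python
import numpy as np
from functools import reduce

N, d = 4, 2   # four sites on a ring, one qubit per site (assumed toy model)

def site_op(b, y):
    """b(y): the operator b at site y (mod N) and the identity elsewhere."""
    factors = [np.eye(d, dtype=complex)] * N
    factors[y % N] = b
    return reduce(np.kron, factors)

def shift_unitary():
    """Permutation matrix moving the tensor factor at site y to site y + 1."""
    dims = (d,) * N
    S = np.zeros((d ** N, d ** N))
    for idx in np.ndindex(*dims):
        src = np.ravel_multi_index(idx, dims)
        dst = np.ravel_multi_index(tuple(np.roll(idx, 1)), dims)
        S[dst, src] = 1.0
    return S

S = shift_unitary()

def tau(x, a):
    """Translation by x >= 0 sites, implemented by conjugation with S^x."""
    P = np.linalg.matrix_power(S, x)
    return P @ a @ P.T

b = np.array([[0, 1], [0, 0]], dtype=complex)
c = np.array([[1, 0], [0, -1]], dtype=complex)

shifts_sites = np.allclose(tau(1, site_op(b, 0)), site_op(b, 1))       # (9.29)
group_law = np.allclose(tau(2, tau(1, site_op(b, 3))), tau(3, site_op(b, 3)))
commute_when_apart = np.allclose(site_op(b, 0) @ tau(2, site_op(c, 0)),
                                 tau(2, site_op(c, 0)) @ site_op(b, 0))
commute_on_same_site = np.allclose(site_op(b, 0) @ site_op(c, 0),
                                   site_op(c, 0) @ site_op(b, 0))
```

On the infinite lattice any two local elements end up on disjoint regions after a sufficiently large translation, which is the content of the first step of the proof; on the finite ring used here the shift eventually wraps around, so the example only illustrates the mechanism.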

Thus quasi-local C*-algebras A satisfy the assumptions in the following theorem, 
which will be important in linking the various notions of SSB discussed earlier. 

Theorem 9.14. Let $A$ be a C*-algebra equipped with an asymptotically abelian
action $\tau$ of $\mathbb{Z}^d$, and let $\omega$ be a translation-invariant primary state on $A$ (i.e., $\tau_x^*\omega = \omega$
for all $x \in \mathbb{Z}^d$). Then $\Omega_\omega$ is the only translation-invariant vector in $H_\omega$. Moreover,

$$\lim_{x \to \infty} \omega(a\tau_x(b)) = \omega(a)\omega(b) \quad (a, b \in A); \tag{9.30}$$

$$\lim_{x \to \infty} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A); \tag{9.31}$$

$$\lim_{\Lambda \uparrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A). \tag{9.32}$$

Here (9.31) and (9.32) hold in the weak operator topology on $B(H_\omega)$, and the limit
$\Lambda \uparrow \mathbb{Z}^d$ in (9.32) is taken along the hypercubes $\Lambda_N$ in (8.153) as $N \to \infty$.
Proof. If $\omega$ is primary, Theorem 8.23 (or its proof) yields

$$\lim_{x \to \infty} |\omega(a\tau_x(b)) - \omega(a)\omega(\tau_x(b))| = 0. \tag{9.33}$$

Translation-invariance of $\omega$ then yields (9.30), which also is a lemma for (9.31) -
(9.32). Towards (9.31) we compute $\omega(a\tau_x(b))$ in terms of the projection

$$e_0 = \lim_{\Lambda \uparrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} u(x) \tag{9.34}$$

onto the translation-invariant subspace of $H_\omega$, where $u$ is the unitary representation
of $\mathbb{Z}^d$ on $H_\omega$ from Corollary 9.12 (with $G = \mathbb{Z}^d$), and the limit is taken in the strong
operator topology. Eq. (9.34) is a special case of von Neumann's $L^2$ ergodic theorem
(which generalizes the Peter-Weyl-Schur relation $e_0 = \int_G dx\, u(x)$ for compact
groups $G$ to amenable groups like $\mathbb{Z}^d$ or $\mathbb{R}^d$). Since $e_0\Omega_\omega = \Omega_\omega$, we have

$$\omega(a\tau_x(b)) = \langle \Omega_\omega, \pi_\omega(a)\pi_\omega(\tau_x(b))\Omega_\omega \rangle \tag{9.35}$$

$$= \langle \Omega_\omega, \pi_\omega(a)([\pi_\omega(\tau_x(b)), e_0] + e_0\pi_\omega(b))\Omega_\omega \rangle. \tag{9.36}$$

We now let $x \to \infty$. The commutator then vanishes, because the weak limit of
$\pi_\omega(\tau_x(b))$ lies in the center of $\pi_\omega(A)''$, which is trivial since $\omega$ is primary. The
remaining term matches with (9.30) iff $e_0$ is one-dimensional, so that $\Omega_\omega$ is the only
translation-invariant vector in $H_\omega$, and $e_0 = |\Omega_\omega\rangle\langle\Omega_\omega|$. A similar trick then yields

$$\pi_\omega(\tau_x(b))\pi_\omega(a)\Omega_\omega = ([\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a)([\pi_\omega(\tau_x(b)), e_0] + \omega(b)))\Omega_\omega.$$

Both commutators vanish (weakly) as $x \to \infty$, proving (9.31). Similarly, write

$$\pi_\omega(\tau_x(b))\pi_\omega(a)\Omega_\omega = ([\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a)u(x)\pi_\omega(b))\Omega_\omega, \tag{9.37}$$

and use (9.34) and the previous formula for $e_0$ to prove (9.32). □

In the C*-algebraic formalism, dynamics is described by a continuous homomorphism
$\alpha : \mathbb{R} \to \mathrm{Aut}(A)$, $t \mapsto \alpha_t$. For $A = B(H')$, where $H'$ is some Hilbert space (not
to be confused with our earlier $H$ in the quasi-local setting), Theorem 5.4 yields

$$\alpha_t(a) = u_t a u_t^* \tag{9.38}$$

for some family of unitaries $u_t = u(t)$, $t \in \mathbb{R}$. Eq. (5.268) and Proposition 5.53 then
imply that the family $u_t$ may be redefined so as to make the map $t \mapsto u_t$ a continuous
unitary representation of $\mathbb{R}$ on $H'$. Stone's Theorem 5.73 finally gives the familiar
expression for time evolution in the so-called Heisenberg picture in terms of the
Hamiltonian $h$, which is a (possibly unbounded) self-adjoint operator on $H'$, i.e.,

$$\alpha_t(a) = e^{ith}ae^{-ith}. \tag{9.39}$$
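As a minimal numerical sketch of (9.39), assuming a finite-dimensional $H'$ so that $h$ is simply a Hermitian matrix, one may check that $\alpha_t$ is a *-automorphism, satisfies the group law, and preserves the operator norm. The function name `evolve`, the random matrices, and the chosen times are illustrative assumptions, not part of the text.

```python
import numpy as np

def evolve(h, a, t):
    """Heisenberg-picture evolution alpha_t(a) = exp(i t h) a exp(-i t h), h Hermitian."""
    energies, vecs = np.linalg.eigh(h)                  # spectral decomposition of h
    u = vecs @ np.diag(np.exp(1j * t * energies)) @ vecs.conj().T   # u_t = exp(i t h)
    return u @ a @ u.conj().T

rng = np.random.default_rng(0)

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

x = random_matrix(3)
h = (x + x.conj().T) / 2          # an arbitrary (assumed) Hamiltonian
a, b = random_matrix(3), random_matrix(3)
s, t = 0.7, -1.3                  # arbitrary times

# alpha_t(ab) = alpha_t(a) alpha_t(b)
homomorphism_ok = np.allclose(evolve(h, a @ b, t), evolve(h, a, t) @ evolve(h, b, t))
# alpha_t(a^*) = alpha_t(a)^*
star_ok = np.allclose(evolve(h, a.conj().T, t), evolve(h, a, t).conj().T)
# alpha_t(alpha_s(a)) = alpha_{s+t}(a)
group_ok = np.allclose(evolve(h, evolve(h, a, s), t), evolve(h, a, s + t))
# automorphisms are isometric in the operator (spectral) norm
isometric_ok = np.isclose(np.linalg.norm(evolve(h, a, t), 2), np.linalg.norm(a, 2))
```

The exponential is computed from the eigendecomposition of $h$ rather than from a power series, which is a convenience of the sketch and only available because $h$ is a matrix here.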
For arbitrary (unital) C*-algebras $A$ one has no counterpart of Theorem 5.4, and
one cannot rely on Theorem 9.11 either because there are no preferred states to begin
with; such states typically require a time-evolution for their definition (see below).
For quantum spin systems (still with $H = \mathbb{C}^n$ and hence $B(H) = M_n(\mathbb{C})$), one tries to
construct the map $t \mapsto \alpha_t$ from local approximations: with $A_\Lambda$ given by (8.129) with
(8.128), we pick local Hamiltonians $h_\Lambda \in B(H_\Lambda)$ and define maps $t \mapsto \alpha^{(\Lambda)}_t \in \mathrm{Aut}(A_\Lambda)$ by

$$\alpha^{(\Lambda)}_t(a) = e^{ith_\Lambda}ae^{-ith_\Lambda}, \tag{9.40}$$

where $a \in A_\Lambda$. Letting $\Lambda \uparrow \mathbb{Z}^d$, we would then like to assemble the family $(\alpha^{(\Lambda)})$ into
a single automorphism group $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$, which describes the dynamics of the
corresponding infinite quantum system. Towards this aim, we start from a potential
(also called an interaction) $\Phi(X) \in B(H_X)$, which is defined for any finite sublattice
$X$ of $\mathbb{Z}^d$, in terms of which the local Hamiltonians $h_\Lambda$ take the form

$$h_\Lambda = \sum_{X\subseteq\Lambda}\Phi(X), \tag{9.41}$$

where the sum is over all sublattices $X$ of $\Lambda$. For nearest-neighbour interactions,
$\Phi(X)$ is nonzero iff $X = \{x,y\}$ is a pair of neighbours, and in the presence of an
external magnetic field one also has terms proportional to $\Phi(\{x\})$. For example,
the quantum Ising model is defined by $H = \mathbb{C}^2$ and $\Phi(\{x,y\}) = -J\sigma_3(x)\sigma_3(y)$ for
nearest neighbours and $\Phi(\{x\}) = -B\sigma_1(x)$ for all $x$, where $J > 0$ and $B \in \mathbb{R}$. The
local Hamiltonians are therefore given by

$$h_\Lambda = -J\sum_{\langle xy\rangle\in\Lambda}\sigma_3(x)\sigma_3(y) - B\sum_{x\in\Lambda}\sigma_1(x), \tag{9.42}$$

where the sum over $\langle xy\rangle \in \Lambda$ denotes summing over nearest neighbours in $\Lambda$. The
expression (9.42) implicitly has so-called free boundary conditions, in that only
neighbours inside $\Lambda$ take part in $h_\Lambda$. Alternatively, one could use periodic boundary
conditions, which in $d = 1$ define the quantum Ising chain

$$h_N = -J\left(\sum_{x=1}^{N-1}\sigma_3(x)\sigma_3(x+1) + \sigma_3(N)\sigma_3(1)\right) - B\sum_{x=1}^{N}\sigma_1(x). \tag{9.43}$$
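The local Hamiltonians of the quantum Ising chain, with free or periodic boundary conditions, can be assembled explicitly from Kronecker products. The following is a sketch under stated assumptions: sites are numbered from 0, the chain length 4 and the values of $J$ and $B$ are arbitrary, and the helper names `site_operator` and `ising_chain` are hypothetical.

```python
import numpy as np

SIGMA_1 = np.array([[0.0, 1.0], [1.0, 0.0]])
SIGMA_3 = np.array([[1.0, 0.0], [0.0, -1.0]])
IDENTITY = np.eye(2)

def site_operator(op, x, n):
    """The operator op at site x (0-based) of an n-site chain, identity elsewhere."""
    result = np.array([[1.0]])
    for y in range(n):
        result = np.kron(result, op if y == x else IDENTITY)
    return result

def ising_chain(n, J, B, periodic=False):
    """Quantum Ising chain: -J sum sigma_3 sigma_3 over bonds, -B sum sigma_1 over sites."""
    h = np.zeros((2 ** n, 2 ** n))
    bonds = [(x, x + 1) for x in range(n - 1)]      # free boundary conditions
    if periodic:
        bonds.append((n - 1, 0))                    # extra bond closing the chain
    for x, y in bonds:
        h -= J * site_operator(SIGMA_3, x, n) @ site_operator(SIGMA_3, y, n)
    for x in range(n):
        h -= B * site_operator(SIGMA_1, x, n)
    return h

spectrum_free = np.linalg.eigvalsh(ising_chain(4, J=1.0, B=0.0))
spectrum_periodic = np.linalg.eigvalsh(ising_chain(4, J=1.0, B=0.0, periodic=True))
h_field = ising_chain(4, J=1.0, B=0.5)
spectrum_field = np.linalg.eigvalsh(h_field)
```

At $B = 0$ the lowest eigenvalue is doubly degenerate (all spins up or all down), whereas for the nonzero field chosen here the two lowest eigenvalues of the finite chain are split.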

In (9.42) - (9.43) the operators $\sigma_i(x)$ in $A_\Lambda$ are defined as explained after (9.29). We
are going to study the quantum Ising chain in detail in connection with SSB; for
the moment, we just mention another popular spin model, namely the Heisenberg
model for magnetism. This also has $H = \mathbb{C}^2$, but the local Hamiltonians are

$$h_\Lambda = J\sum_{\langle xy\rangle\in\Lambda}\sum_{i=1}^{3}\sigma_i(x)\sigma_i(y), \tag{9.44}$$

with free boundary conditions, where $J < 0$ ($J > 0$) yields (anti)ferromagnetism.
Although we do not have (9.38) for any $u_t \in A$, we may construct $\alpha_t$ as follows.

Theorem 9.15. Let $\Phi$ be a short-range potential in that there is $r \in \mathbb{N}$ such that
$\Phi(X) \neq 0$ only if $|x - y| \le r$ for all $x,y \in X$, and define local Hamiltonians $h_\Lambda$ by
(9.41). For fixed finite $\Lambda \subset \mathbb{Z}^d$ and $a \in A_\Lambda$, the following (norm) limit exists and
defines an automorphism $\alpha_t$ of $\cup_\Lambda A_\Lambda$ and hence by continuity also of $A$:

$$\alpha_t(a) = \lim_{N\to\infty}e^{ith_{\Lambda_N}}ae^{-ith_{\Lambda_N}}. \tag{9.45}$$


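Proof. Note that for large enough $N$, the hypercube $\Lambda_N$ contains any finite $\Lambda \subset \mathbb{Z}^d$.
Take $a \in A_\Lambda$, take $\Lambda_{N_2} \supset \Lambda_{N_1} \supset \Lambda$, and use (9.40) and (9.41) to compute, writing
$\alpha^{(N)}_t$ for the local dynamics (9.40) on $\Lambda_N$,

$$\begin{aligned}
\|\alpha^{(N_2)}_t(a) - \alpha^{(N_1)}_t(a)\| &= \left\|\int_0^t ds\,\frac{d}{ds}\left(\alpha^{(N_2)}_s\circ\alpha^{(N_1)}_{t-s}(a)\right)\right\| \\
&= \left\|\int_0^t ds\,\alpha^{(N_2)}_s\left(i[h_{\Lambda_{N_2}},\alpha^{(N_1)}_{t-s}(a)] - i[h_{\Lambda_{N_1}},\alpha^{(N_1)}_{t-s}(a)]\right)\right\| \\
&= \left\|\int_0^t ds\,\alpha^{(N_2)}_s\left([h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}},\alpha^{(N_1)}_{t-s}(a)]\right)\right\| \\
&\le \int_0^t ds\,\left\|\alpha^{(N_2)}_s\left([h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}},\alpha^{(N_1)}_{t-s}(a)]\right)\right\| \\
&\le \int_0^t ds\,\left\|[h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}},\alpha^{(N_1)}_{t-s}(a)]\right\| \\
&\le \int_0^t ds\sum_{x\in\Lambda_{N_2}\setminus\Lambda_{N_1}}\sum_{X\ni x}\left\|[\Phi(X),\alpha^{(N_1)}_{t-s}(a)]\right\| \\
&\le \sum_{x\in\Lambda_{N_2}\setminus\Lambda_{N_1}}\sum_{X\ni x}\int_0^t ds\,\left\|[\Phi(X),\alpha^{(N_1)}_s(a)]\right\|.
\end{aligned} \tag{9.46}$$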


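We now show that the left-hand side of the first line becomes arbitrarily small, so that
$(\alpha^{(N)}_t(a))_N$ is a Cauchy sequence. Since $\alpha^{(N_1)}_t(a) = e^{ith_{\Lambda_{N_1}}}ae^{-ith_{\Lambda_{N_1}}}$ lies in $A_{\Lambda_{N_1}}$,
which is finite-dimensional (as $\Lambda_{N_1}$ is finite), we have a norm-convergent expansion

$$\alpha^{(N_1)}_t(a) = a + it\sum_{Y_1\subseteq\Lambda_{N_1}}[\Phi(Y_1),a] + \frac{(it)^2}{2!}\sum_{Y_1,Y_2\subseteq\Lambda_{N_1}}[\Phi(Y_2),[\Phi(Y_1),a]] + \cdots. \tag{9.48}$$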


Let $\Lambda(r)$ consist of all $y \in \mathbb{Z}^d$ for which there is some $x \in \Lambda$ for which $|x - y| \le r$.
Then the zeroth term $a$ in (9.48) is in $A_\Lambda$, the first is in $A_{\Lambda(r)}$, ..., the $n$'th is in
$A_{\Lambda(nr)}$. Therefore, we can find $n = n(N_1,N_2,\Lambda)$ such that the only terms in (9.48) that
contribute to the commutator in (9.46) are the $n$'th and beyond. Taking $N_1$ and $N_2$
large enough, this tail can be made arbitrarily small, so that $(\alpha^{(N)}_t(a))_N$ is a Cauchy
sequence in $A$. This gives convergence of (9.45) for $a \in A_\Lambda$, where $\Lambda$ is arbitrary
(but finite), yielding an automorphism $\alpha_t$ in $\cup_\Lambda A_\Lambda$. Being an automorphism, $\alpha_t$ is
isometric, so that it extends to $A$ by continuity. □

9.4 Ground states of quantum systems

A ground state of a finite system $A_\Lambda = B(H_\Lambda)$ is an eigenstate of the local Hamiltonian
$h_\Lambda$ with the lowest eigenvalue; because $\dim(H_\Lambda) < \infty$ the spectrum of $h_\Lambda$ is
discrete and hence local ground states exist. For infinite systems, no Hamiltonian is
yet defined, so we need to define ground states in terms of the dynamics $\alpha_t$.

Definition 9.16. Let $A$ be a C*-algebra with time evolution, i.e., a continuous homomorphism
$\alpha : \mathbb{R} \to \mathrm{Aut}(A)$ (which gives the dynamics of the underlying physical
system). A ground state of $(A,\alpha)$ is a state $\omega$ on $A$ such that:

1. $\omega$ is time-independent, i.e., $\alpha_t^*\omega = \omega$ (or $\omega(\alpha_t(a)) = \omega(a)$ for all $a \in A$) for all $t \in \mathbb{R}$;

2. The generator $h_\omega$ of the ensuing continuous unitary representation

$$t \mapsto u_t = e^{ith_\omega} \tag{9.49}$$

of $\mathbb{R}$ on $H_\omega$ has positive spectrum, i.e., $\sigma(h_\omega) \subseteq \mathbb{R}^+$, or, equivalently,

$$\langle\psi,h_\omega\psi\rangle \ge 0 \quad (\psi \in D(h_\omega)). \tag{9.50}$$

Note that the existence of the operator $h_\omega$ is guaranteed by Corollary 9.12 and the
arguments after (9.38). Since Corollary 9.12 yields

$$h_\omega\Omega_\omega = 0; \tag{9.51}$$

$$\pi_\omega(\alpha_t(a)) = e^{ith_\omega}\pi_\omega(a)e^{-ith_\omega}, \tag{9.52}$$

it follows that $h_\omega$ is a Hamiltonian in the usual sense, implementing the Heisenberg-picture
time evolution (albeit in the representation $\pi_\omega(A)$ rather than in $A$ itself).
Moreover, in view of (9.51) and the assumed positivity of $\sigma(h_\omega)$, the unit vector
$\Omega_\omega$ of the GNS-representation $\pi_\omega$ induced by a ground state $\omega$ is a ground state
for the Hamiltonian $h_\omega$ in the usual sense. If $\omega$ is pure (see below for a discussion
of this desirable possibility), then obviously $\exp(ith_\omega) \in \pi_\omega(A)''$, since the latter
equals $B(H_\omega)$. A deep result states that this is always the case (Borchers' Theorem):

Theorem 9.17. If $\omega$ is a ground state on $A$, then $\exp(ith_\omega) \in \pi_\omega(A)''$ for all $t \in \mathbb{R}$.

As we shall see, this contrasts with equilibrium states. The Heisenberg equation of
motion for operators $a(t)$ has a counterpart in the C*-algebraic formalism, which
requires a concept already encountered in §3.1, but repeated here for convenience:

Definition 9.18. A derivation on a C*-algebra $A$ is a linear map $\delta : A \to A$ with

$$\delta(ab) = \delta(a)b + a\delta(b) \quad (a,b \in A) \quad \text{(Leibniz rule)}. \tag{9.53}$$

An unbounded derivation is a linear map $\delta : \mathrm{Dom}(\delta) \to A$, where the domain
$\mathrm{Dom}(\delta) \subset A$ of $\delta$ is a dense linear subspace of $A$, that satisfies the Leibniz rule.

An (unbounded) derivation $\delta$ is symmetric when $\delta(a^*) = \delta(a)^*$ for all $a$ (in
$\mathrm{Dom}(\delta)$, which must be self-adjoint in that $a \in \mathrm{Dom}(\delta)$ iff $a^* \in \mathrm{Dom}(\delta)$).

Bounded derivations are rare in classical physics; nonzero derivations of $A = C_0(\mathbb{R}^d)$
do not even exist, but it has plenty of unbounded derivations, viz. $\delta(f) = \xi f$ for
some vector field $\xi$ on $\mathbb{R}^d$. In quantum mechanics, $A = B(H')$ does have derivations,
all given by $\delta(a) = i[h,a]$ for some bounded (self-adjoint) operator $h$ on $H'$.

Proposition 9.19. Any continuous homomorphism $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$ on any C*-algebra
$A$ defines an unbounded symmetric derivation $\delta$ on $A$ by the norm limit

$$\delta(a) = \frac{d}{dt}\alpha_t(a)\Big|_{t=0} = \lim_{t\to 0}\frac{\alpha_t(a) - a}{t}, \tag{9.54}$$

where $\mathrm{Dom}(\delta)$ consists of all $a \in A$ for which this limit exists. Moreover, this domain
is stable under $\alpha$, in that if $a \in \mathrm{Dom}(\delta)$, then $\alpha_t(a) \in \mathrm{Dom}(\delta)$ ($t \in \mathbb{R}$).

The proof is an elementary verification (cf. Theorem 5.73). On $H_\omega$ we then have

$$\pi_\omega(\delta(a)) = i[h_\omega,\pi_\omega(a)], \tag{9.55}$$

which, then, is "Heisenberg's equation of motion revisited." One may also reformulate
Definition 9.16 in terms of the derivation $\delta$ associated to $\alpha$ by (9.54):

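Proposition 9.20. A state $\omega \in S(A)$ is a ground state for given dynamics $\alpha$ iff

$$-i\omega(a^*\delta(a)) \ge 0 \quad (a \in \mathrm{Dom}(\delta)). \tag{9.56}$$

Proof. If $\omega$ is a ground state according to Definition 9.16, we may use (9.55),
(C.196), (9.51), and finally (9.50) to compute

$$-i\omega(a^*\delta(a)) = -i\langle\Omega_\omega,\pi_\omega(a^*\delta(a))\Omega_\omega\rangle = \langle\Omega_\omega,\pi_\omega(a)^*[h_\omega,\pi_\omega(a)]\Omega_\omega\rangle = \langle\pi_\omega(a)\Omega_\omega,h_\omega\pi_\omega(a)\Omega_\omega\rangle \ge 0. \tag{9.57}$$

Conversely, we first show that if $\omega$ satisfies (9.56), then it is $\alpha_t$-invariant. We initially
assume $a = a^*$, so that $\delta(a)^* = \delta(a^*) = \delta(a)$, as $\delta$ is symmetric by construction.
Since $\omega$ is a state, one has $\omega(b^*) = \overline{\omega(b)}$ for any $b \in A$, so taking $b = \delta(a)a$, using
(9.56) just in that $i\omega(a^*\delta(a)) \in \mathbb{R}$, we obtain $\omega(\delta(a)a) = -\omega(a\delta(a))$. Hence

$$\omega(\delta(a^2)) = 0, \tag{9.58}$$

by (9.53), so also $\omega(\delta(\alpha_s(a)^2)) = 0$, $s \in \mathbb{R}$. With (9.54), we find

$$0 = \int_0^u ds\,\omega(\delta(\alpha_s(a)^2)) = \int_0^u ds\,\omega\left(\frac{d}{dt}\alpha_t(\alpha_s(a)^2)\Big|_{t=0}\right) = \int_0^u ds\,\frac{d}{dt}\omega(\alpha_{t+s}(a)^2)\Big|_{t=0} = \int_0^u ds\,\frac{d}{ds}\omega(\alpha_s(a)^2) = \omega(\alpha_u(a^2)) - \omega(a^2).$$

Hence $\omega(\alpha_u(a^2)) = \omega(a^2)$ for each $u > 0$ (and analogously for each $u < 0$), whenever
$a^* = a$, i.e., $\omega(\alpha_u(b)) = \omega(b)$ for each $b \ge 0$. But any $b \in A$ may be written as
a linear combination of at most four positive elements, so $\omega\circ\alpha_u = \omega$ for all $u \in \mathbb{R}$. We therefore
have a Hamiltonian $h_\omega$, whose positivity follows from (9.57), run backwards. □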
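The criterion (9.56) can be checked numerically in the simplest setting, assuming $A = M_n(\mathbb{C})$ with the bounded derivation $\delta(a) = i[h,a]$. The sketch below takes the vector state of a lowest eigenvector, for which $-i\omega(a^*\delta(a))$ comes out non-negative on random samples, and the vector state of an excited eigenvector, for which a suitable $a$ makes it negative. The random Hamiltonian, the sample size, and the names `delta` and `criterion` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = (x + x.conj().T) / 2                       # a generic (assumed) Hamiltonian
energies, vecs = np.linalg.eigh(h)             # eigenvalues in ascending order

def delta(a):
    """Bounded derivation delta(a) = i [h, a]."""
    return 1j * (h @ a - a @ h)

def criterion(psi, a):
    """-i * omega(a^* delta(a)) for the vector state omega = <psi, . psi>."""
    return -1j * np.vdot(psi, a.conj().T @ delta(a) @ psi)

ground, excited = vecs[:, 0], vecs[:, 1]
samples = [rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)) for _ in range(200)]
ground_values = [criterion(ground, a) for a in samples]

# For the excited state, a = |psi_0><psi_1| gives the value E_0 - E_1 < 0.
lowering = np.outer(ground, excited.conj())
excited_value = criterion(excited, lowering)
```

In this finite-dimensional case the quantity equals $\langle a\psi,(h - E)a\psi\rangle$ with $E$ the energy of the eigenvector $\psi$, which is why only the lowest eigenvector passes the test for all $a$.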
9.5 Ground states and equilibrium states of classical spin systems

Thermal equilibrium states are arguably physically more relevant than ground states, 
as the latter rely on the idealization of temperature zero. Since in statistical mechan¬ 
ics infinite systems are used to approximate very large ones, it will be of particular 
interest to define equilibrium states in infinite volume. If only to highlight contrasts 
with quantum theory, we take a long run and start with the classical case. 

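Classical spin systems on a lattice are defined by a single-site configuration space
$\underline{n} = \{0,1,\ldots,n-1\}$, where $m \in \underline{n}$ may either be interpreted as some spin-like degree
of freedom (as in the Ising model, where $n = 2$) or as the number of (structureless)
particles occupying a given site (in which case one has a lattice gas). As in (C.310),
for any finite sublattice $\Lambda \subset \mathbb{Z}^d$, the local algebra of observables is given by

$$A^{(c)}_\Lambda = C(\underline{n}^\Lambda), \tag{9.59}$$

where $\underline{n}^\Lambda = C(\Lambda,\underline{n})$ consists of all functions $s : \Lambda \to \underline{n}$. For finite $\Lambda$ this is a finite
set (of cardinality $n^{|\Lambda|}$), so that all functions in question are continuous and hence
$C(\underline{n}^\Lambda)$ just stands for the commutative C*-algebra of all functions from $\underline{n}^\Lambda$ to $\mathbb{C}$. If
$\Lambda^{(1)} \subset \Lambda^{(2)}$, we have maps $A^{(c)}_{\Lambda^{(1)}} \hookrightarrow A^{(c)}_{\Lambda^{(2)}}$, written $f_1 \mapsto f_2$, which are given by

$$f_2(s) = f_1(s_{|\Lambda^{(1)}}), \tag{9.60}$$

where $s : \Lambda^{(2)} \to \underline{n}$. As these maps are injective, the ensuing inductive limit is simply

$$A^{(c)} = C(\underline{n}^{\mathbb{Z}^d}), \tag{9.61}$$

where $\underline{n}^{\mathbb{Z}^d}$ (the set of all functions $s : \mathbb{Z}^d \to \underline{n}$) is endowed with the product topology
and hence (by Tychonoff's theorem) is compact (for $n = 2$, $d = 1$ this is a model of the Cantor set).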

As in the quantum case, local Hamiltonians are defined via an interaction $\Phi$,
which now is an assignment $X \mapsto \Phi(X)$, where $X \subset \mathbb{Z}^d$ is finite and $\Phi(X) \in A^{(c)}_X$.
If $X \subset Y$, we regard $\Phi(X)$ as an element in $A^{(c)}_Y$ through the inclusion $A^{(c)}_X \subset A^{(c)}_Y$,
indicating this explicitly by writing $\Phi(X)_Y \in A^{(c)}_Y$. We then define $h_\Lambda \in A^{(c)}_\Lambda$ by

$$h_\Lambda = \sum_{X\subseteq\Lambda}\Phi(X)_\Lambda, \tag{9.62}$$

where the sum is over all subsets $X$ of $\Lambda$. For example, the Ising Hamiltonian

$$h_\Lambda(s) = -J\sum_{\langle ij\rangle\in\Lambda}s_is_j - B\sum_{i\in\Lambda}s_i, \tag{9.63}$$

where the first sum is over nearest neighbours in $\Lambda$, and we assume $\underline{2} = \{-1,1\}$ (rather
than the usual c-bit $\{0,1\}$), comes from the following potential:

• $\Phi(X) = 0$ if either $|X| > 2$ or, if $|X| = 2$, its elements are not nearest neighbours;

• $\Phi(\{i\}) : s \mapsto -Bs_i$, and $\Phi(\{i,j\}) : s \mapsto -Js_is_j$ if $i$ and $j$ are nearest neighbours.

As in (9.41), the prescription (9.62) has free boundary conditions, in that it only
involves spins inside $\Lambda$. Another possibility is to fix a "boundary" spin configuration
$b \in \underline{n}^{\mathbb{Z}^d}$ and define $h^b_\Lambda \in A^{(c)}_\Lambda$ by

$$h^b_\Lambda = \sum_{X\subset\mathbb{Z}^d,\,|X|<\infty,\,X\cap\Lambda\neq\emptyset}\Phi(X)^b_\Lambda. \tag{9.64}$$

This involves some new notation $\Phi(X)^b_\Lambda$, which means the following. In principle,
$\Phi(X) \in A^{(c)}_X$ is a function on $\underline{n}^X$. We now turn $\Phi(X)$ into a function $\Phi(X)^b_\Lambda$ on $\underline{n}^\Lambda$
(so that $h^b_\Lambda$ is a function on $\underline{n}^\Lambda$, as required): for given $s : \Lambda \to \underline{n}$ and given $b : \mathbb{Z}^d \to \underline{n}$
we define $s' : X \to \underline{n}$ by putting $s' = s$ on $X \cap \Lambda$ and $s' = b$ on the remainder of $X$
(which is $X \cap \Lambda^c$, with $\Lambda^c = \mathbb{Z}^d\setminus\Lambda$). Then

$$\Phi(X)^b_\Lambda(s) = \Phi(X)(s'). \tag{9.65}$$

Physically, this simply means that those spins outside $\Lambda$ that interact with spins inside
$\Lambda$ are set at a fixed value determined by the boundary condition $b$. For example,
consider the Ising model in $d = 1$. If we take $\Lambda = \{2,3\}$, then from (9.62) we obtain
$h_\Lambda = -Js_2s_3 - B(s_2 + s_3)$; spins outside $\Lambda$ do not contribute. From (9.64), on the
other hand, we obtain $h^b_\Lambda = h_\Lambda - J(b_1s_2 + s_3b_4)$. Although the boundary condition
$b$ is arbitrary, one may think of simple choices like $b_i = 1$ or $-1$ for each $i$.
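The one-dimensional example with $\Lambda = \{2,3\}$ can be verified by enumeration. The sketch below represents a configuration as a dictionary from sites to $\pm 1$ and checks, for all sixteen choices of $(s_2,s_3,b_1,b_4)$, that the Hamiltonian with boundary condition equals the free one plus the two boundary bonds. The values of $J$ and $B$ and the function names are assumptions made for illustration.

```python
from itertools import product

J, B = 1.0, 0.3   # illustrative (assumed) coupling and field

def h_free(s):
    """Free boundary conditions: s maps the sites of Lambda to +1 or -1."""
    bond_energy = sum(-J * s[i] * s[i + 1] for i in s if i + 1 in s)
    field_energy = sum(-B * s[i] for i in s)
    return bond_energy + field_energy

def h_boundary(s, b):
    """Boundary condition b: b maps the relevant sites outside Lambda to +1 or -1."""
    def merged(i):
        # the configuration s' that equals s inside Lambda and b outside
        return s[i] if i in s else b[i]
    total = sum(-B * s[i] for i in s)                            # X = {i}, i in Lambda
    bonds = {(i, i + 1) for i in s} | {(i - 1, i) for i in s}    # pairs meeting Lambda
    total += sum(-J * merged(i) * merged(j) for i, j in bonds)
    return total

checks = []
for s2, s3, b1, b4 in product([-1, 1], repeat=4):
    s, b = {2: s2, 3: s3}, {1: b1, 4: b4}
    expected = h_free(s) - J * (b1 * s2 + s3 * b4)
    checks.append(abs(h_boundary(s, b) - expected) < 1e-12)
```

Only bonds with at least one endpoint in $\Lambda$ enter `h_boundary`, mirroring the condition $X \cap \Lambda \neq \emptyset$.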

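We may actually rewrite (9.64) as a difference between Hamiltonians with free
boundary conditions. To do so, for given finite $\Lambda$ we pick some finite $\Lambda' \supset \Lambda$ large
enough that it contains all spins outside $\Lambda$ that interact with spins inside $\Lambda$ (provided
this is possible). With the conventional notation $h_\Lambda(s|b) = h^b_\Lambda(s)$, this yields

$$h_\Lambda(s|b) = h_{\Lambda'}(s,b) - h_{\Lambda'\setminus\Lambda}(b) = \sum_{X'\subseteq\Lambda'}\Phi(X')_{\Lambda'}(s,b) - \sum_{Y\subseteq\Lambda'\setminus\Lambda}\Phi(Y)_{\Lambda'\setminus\Lambda}(b).$$

Analogous to (9.65), the notation $\Phi(X')_{\Lambda'}(s,b)$ here means $\Phi(X')_{\Lambda'}(s')$, for the
function $s' : \Lambda' \to \underline{n}$ that on $\Lambda \subset \Lambda'$ coincides with $s : \Lambda \to \underline{n}$, whilst on $(\Lambda'\setminus\Lambda) \subset \Lambda'$
it coincides with the restriction of $b$ to $\Lambda'\setminus\Lambda$. Thus we may also write

$$h_\Lambda(s|b) = \lim_{\Lambda'\uparrow\mathbb{Z}^d}\left(h_{\Lambda'}(s,b) - h_{\Lambda'\setminus\Lambda}(b)\right), \tag{9.66}$$

although neither $h_{\mathbb{Z}^d}(s,b)$ nor $h_{\mathbb{Z}^d\setminus\Lambda}(b)$ makes sense by itself. Periodic boundary
conditions for local Hamiltonians may be defined for arbitrary interactions $\Phi$ and
special lattices. For example, the Ising chain in $d = 1$ has local Hamiltonians

$$h_{\{1,2,\ldots,n\}}(s) = -J\left(s_ns_1 + \sum_{i=1}^{n-1}s_is_{i+1}\right) - B\sum_{i=1}^{n}s_i. \tag{9.67}$$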

Naively, a ground state of a finite classical spin system, i.e., a system of the
above kind defined on a fixed finite lattice $\Lambda \subset \mathbb{Z}^d$, is a spin configuration $s_0 \in \underline{n}^\Lambda$
that minimizes the local Hamiltonian $h_\Lambda$ (9.62), or its counterpart (9.64), that is,

$$h_\Lambda(s_0) \le h_\Lambda(s) \tag{9.68}$$

for all $s \in \underline{n}^\Lambda$. For example, if $\Lambda$ is a hypercube $\Lambda_N$, then the Ising model (9.63)
has a unique ground state for $B > 0$, namely $s_0(x) = 1$ for all $x \in \Lambda$, whereas it
has two ground states $s_0^\pm$ for $B = 0$, given by $s_0^\pm(x) = \pm 1$ for all $x$. Ground states
of finite classical systems always exist (since the space $\underline{n}^\Lambda$ on which $h_\Lambda$ is defined is finite), but
they are not necessarily unique; we just gave a counterexample! The same is true
for quantum theory, since for $B = 0$ also the quantum Ising model (9.42) has two
degenerate symmetry-breaking ground states. Nonetheless, this case is special, since
for nonzero small values of $B$ the ground state of the quantum Ising model is unique
for finite $\Lambda$, whereas on the infinite lattice it is degenerate (cf. §10.7).
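For a small lattice the ground states in the sense of (9.68) can be found by brute force. The sketch below does this for the Ising model on a chain of five sites with free boundary conditions, assuming $d = 1$ for simplicity; the chain length, the parameter values, and the function names are illustrative choices.

```python
from itertools import product

def ising_energy(s, J, B):
    """Free-boundary Ising Hamiltonian on a chain; s is a tuple of +1/-1 values."""
    bonds = sum(s[i] * s[i + 1] for i in range(len(s) - 1))
    return -J * bonds - B * sum(s)

def ground_states(n, J, B):
    """All configurations on n sites that minimize the energy."""
    configs = list(product([-1, 1], repeat=n))
    lowest = min(ising_energy(s, J, B) for s in configs)
    return [s for s in configs if abs(ising_energy(s, J, B) - lowest) < 1e-12]

with_field = ground_states(5, J=1.0, B=0.2)      # B > 0: a unique ground state
without_field = ground_states(5, J=1.0, B=0.0)   # B = 0: two ground states
```

With $B > 0$ only the all-up configuration survives, while at $B = 0$ the all-up and all-down configurations are both minimizers.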

The definition of ground states of infinite classical spin systems is just slightly
more involved; for local Hamiltonians with free boundary conditions defined by
an interaction $\Phi$ à la (9.62), a ground state is a point $s_0 \in \underline{n}^{\mathbb{Z}^d}$ for which

$$h_\Lambda(s_{0|\Lambda}) \le h_\Lambda(s_{|\Lambda}), \tag{9.69}$$

for any finite $\Lambda$ and any spin configuration $s \in \underline{n}^{\mathbb{Z}^d}$. Alternatively, one may ask

$$h^{s_0}_\Lambda(s_{0|\Lambda}) \le h^{s_0}_\Lambda(s_{|\Lambda}) \tag{9.70}$$

for all finite $\Lambda$ and all spin configurations $s \in \underline{n}^{\mathbb{Z}^d}$ that coincide with $s_0$ outside
$\Lambda$, where $h^{s_0}_\Lambda$ stands for (9.64) with $b = s_0$. In other words, $s_0$ provides a boundary
condition $b$, which is fixed for all $s$ that compete with $s_0$ in minimizing the local
Hamiltonian $h^{s_0}_\Lambda$. Both definitions give the usual two ground states for the Ising
model with $B = 0$ (in which all spins are either "up" or "down"), but the second
one also opens the possibility of domain walls, where infinite chains of "spin up"
alternate with infinite chains of "spin down", and similarly in higher $d$.

If different ground states in the above ("pure") sense exist, we may reinterpret
such states $s_0$ as Dirac measures $\delta_{s_0}$ on the space $\underline{n}^\Lambda$ of all spin configurations on $\Lambda$,
and may also allow convex combinations of ground states as ground states. This, as
well as the analogy with Definition 9.16 (in which no purity condition is imposed)
inspires a more liberal definition of a ground state, which is predicated on Boltzmann's
idea that a state of a classical system of the kind we consider is a probability
measure on $\underline{n}^\Lambda$, and likewise for $\underline{n}^{\mathbb{Z}^d}$. In the C*-algebraic formalism we use, this
follows from (9.61) and the identification of states on $C(X)$ with completely regular
probability measures on $X$ (assumed to be a compact Hausdorff space, cf. §B.5). A
state $\mu$ on $C(\underline{n}^{\mathbb{Z}^d})$, i.e., a probability measure on $\underline{n}^{\mathbb{Z}^d}$, induces a state on each local
algebra $C(\underline{n}^\Lambda)$, i.e., a probability measure $\mu_\Lambda$ on $\underline{n}^\Lambda$ simply by restriction, since

$$C(\underline{n}^\Lambda) \subset C(\underline{n}^{\mathbb{Z}^d}) \tag{9.71}$$

through the injection (9.60), according to which $f_\Lambda \in C(\underline{n}^\Lambda)$ has image $f \in C(\underline{n}^{\mathbb{Z}^d})$
defined by $f(s) = f_\Lambda(s_{|\Lambda})$. The measure $\mu_\Lambda$, then, is given in terms of $\mu$ by

$$\mu_\Lambda(f_\Lambda) = \mu(f); \tag{9.72}$$

the corresponding probability distribution (i.e., $p_\Lambda(s) = \mu_\Lambda(\{s\})$) is given by

$$p_\Lambda(s) = \mu\left(\{s' \in \underline{n}^{\mathbb{Z}^d} \mid s'_{|\Lambda} = s\}\right), \quad s \in \underline{n}^\Lambda. \tag{9.73}$$

The family of probability measures $(\mu_\Lambda)$ defined by $\mu$ is consistent in that if $\Lambda^{(1)} \subset
\Lambda^{(2)}$ and $f_1 \in C(\underline{n}^{\Lambda^{(1)}})$ and $f_2 \in C(\underline{n}^{\Lambda^{(2)}})$ are related as in (9.60), then

$$\mu_{\Lambda^{(1)}}(f_1) = \mu_{\Lambda^{(2)}}(f_2). \tag{9.74}$$

Conversely, a consistent family of probability measures $(\mu_\Lambda)$ defines a unique probability
measure $\mu$ on $\underline{n}^{\mathbb{Z}^d}$ which induces the given family through (9.72).

Definition 9.21. For given finite $\Lambda \subset \mathbb{Z}^d$, a probability measure $\mu^0_\Lambda$ on $\underline{n}^\Lambda$ is a
ground state of a local Hamiltonian $h_\Lambda$ (with free boundary conditions) if, in terms
of the probabilities $p^0_\Lambda(s) = \mu^0_\Lambda(\{s\})$, for any probability measure $\mu_\Lambda$ on $\underline{n}^\Lambda$,

$$\sum_{s\in\underline{n}^\Lambda}p^0_\Lambda(s)h_\Lambda(s) \le \sum_{s\in\underline{n}^\Lambda}p_\Lambda(s)h_\Lambda(s). \tag{9.75}$$

A probability measure $\mu^0$ on $\underline{n}^{\mathbb{Z}^d}$ is a ground state for some interaction $\Phi$ if (9.75)
holds for any probability measure $\mu$ on $\underline{n}^{\mathbb{Z}^d}$ and any finite subset $\Lambda \subset \mathbb{Z}^d$, where this
time $p^0_\Lambda$ (and analogously $p_\Lambda$) is defined by (9.73).

In particular, convex sums of pure ground states are ground states in this more general
sense, so that, if all pure ground states break some symmetry (as is the case
for the $\mathbb{Z}_2$-symmetry $s \mapsto -s$ of the Ising model at $B = 0$), symmetric convex sums
will restore the symmetry. The set of all ground states of a given interaction is a
convex set, whose extreme points are the pure ground states (at least, under suitable
hypotheses on $\Phi$). This leads to a discussion of SSB similar to the quantum case.

In the following discussion of equilibrium states, we use the notation

$$\Pr(X) \equiv S(C(X)) \tag{9.76}$$

for the compact convex set of all completely regular probability measures on $X$,
which as above will either be the finite set $\underline{n}^\Lambda$ (with discrete topology), on which of
course any probability measure is completely regular, or the compact space $\underline{n}^{\mathbb{Z}^d}$. In
the first case we may as well use probability distributions $p_\Lambda$ (instead of probability
measures) on $\underline{n}^\Lambda$. In the second, we could also use Baire measures.

Given an interaction $\Phi$ and the ensuing family (9.62) of local Hamiltonians $h_\Lambda$,
we define the local energy for each finite $\Lambda \subset \mathbb{Z}^d$ as a function $\mathcal{E}_\Lambda : \Pr(\underline{n}^\Lambda) \to \mathbb{R}$ by

$$\mathcal{E}_\Lambda(p_\Lambda) = \sum_{s\in\underline{n}^\Lambda}p_\Lambda(s)h_\Lambda(s). \tag{9.77}$$

Of course, this is just the expectation value of the Hamiltonian in the state $p_\Lambda$. The
local entropy $S_\Lambda : \Pr(\underline{n}^\Lambda) \to \mathbb{R}$ is a more subtle concept; rather than the expectation
value of some (local) observable, it specifies a property of the probability distribution
itself. With Boltzmann's constant $k_B$, we have

$$S_\Lambda(p_\Lambda) = -k_B\sum_{s\in\underline{n}^\Lambda}p_\Lambda(s)\ln(p_\Lambda(s)). \tag{9.78}$$

Note that $S_\Lambda(p_\Lambda) \ge 0$, with equality iff $p_\Lambda$ is a pure state (i.e., $p_\Lambda$ is supported at a
single spin configuration). The local free energy $F^\beta_\Lambda : \Pr(\underline{n}^\Lambda) \to \mathbb{R}$ is defined as

$$F^\beta_\Lambda = \mathcal{E}_\Lambda - TS_\Lambda, \tag{9.79}$$

where $\beta = 1/k_BT$. A local equilibrium state, then, is a probability distribution $p^\beta_\Lambda$
that minimizes the free energy (for fixed temperature $T$).

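Theorem 9.22. For each $T > 0$, there is a unique local equilibrium state, given by
the Boltzmann distribution (and associated partition function)

$$p^\beta_\Lambda(s) = (Z^\beta_\Lambda)^{-1}e^{-\beta h_\Lambda(s)}; \tag{9.80}$$

$$Z^\beta_\Lambda = \sum_{s\in\underline{n}^\Lambda}e^{-\beta h_\Lambda(s)}. \tag{9.81}$$

The associated free energy in equilibrium is then given by

$$F^\beta_\Lambda \equiv F^\beta_\Lambda(p^\beta_\Lambda) = -\beta^{-1}\ln Z^\beta_\Lambda. \tag{9.82}$$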

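Proof. The claim follows from the fact that any $p_\Lambda \in \Pr(\underline{n}^\Lambda)$ satisfies the inequality

$$F^\beta_\Lambda(p_\Lambda) \ge -\beta^{-1}\ln Z^\beta_\Lambda, \tag{9.83}$$

with equality iff $p_\Lambda = p^\beta_\Lambda$, i.e., using (9.79), (9.77), and (9.78), and writing $p$ for $p_\Lambda$, we need to show that

$$\sum_{s\in\underline{n}^\Lambda}p(s)\left(h_\Lambda(s) + \beta^{-1}\ln p(s)\right) + \beta^{-1}\ln Z^\beta_\Lambda \ge 0. \tag{9.84}$$

Using (9.80), for each $s \in \underline{n}^\Lambda$ we obtain

$$-\beta h_\Lambda(s) = \ln Z^\beta_\Lambda + \ln p^\beta_\Lambda(s). \tag{9.85}$$

Substituting this in (9.84), using $\sum_s p(s) = 1$, omitting the ensuing prefactor $\beta^{-1}$,
and noting that $p^\beta_\Lambda(s) > 0$ for all $s$, the inequality (9.84) to be proved becomes

$$\sum_{s\in\underline{n}^\Lambda}p(s)\ln\left(\frac{p(s)}{p^\beta_\Lambda(s)}\right) \ge 0. \tag{9.86}$$

Hence we need to prove the inequality

$$\sum_{s\in\underline{n}^\Lambda}p^\beta_\Lambda(s)\,\frac{p(s)}{p^\beta_\Lambda(s)}\ln\left(\frac{p(s)}{p^\beta_\Lambda(s)}\right) \ge 0, \tag{9.87}$$

with equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$. Let us note that the function $f(x) = x\ln x$ is
strictly convex for all $x > 0$, that is, for any finite set of numbers $p'(s) \in (0,1)$ with
$\sum_s p'(s) = 1$ and any finite set of positive real numbers $x_s > 0$, we have

$$\sum_s p'(s)f(x_s) \ge f\left(\sum_s p'(s)x_s\right), \tag{9.88}$$

with equality iff all numbers $x_s$ are the same. Applying this with $p'(s) = p^\beta_\Lambda(s)$ and
$x_s = p(s)/p^\beta_\Lambda(s)$, so that $p'(s)x_s = p(s)$ and hence $\sum_s p'(s)x_s = \sum_s p(s) = 1$, which
makes the right-hand side of (9.88) vanish since $\ln(1) = 0$, finally leads to (9.87).
Equality arises iff $p(s)/p^\beta_\Lambda(s)$ equals the same number $c$ for all $s$; summing over all $s$
forces $c = 1$, so that one has equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$, as desired. □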
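The statement of Theorem 9.22 lends itself to a quick numerical sanity check. The sketch below, which assumes units with $k_B = 1$, a three-site Ising chain with free boundary conditions, and arbitrary parameter values, computes the Boltzmann distribution, confirms that its free energy equals $-\beta^{-1}\ln Z$, and compares it with the free energy of randomly drawn probability distributions.

```python
import math
import random
from itertools import product

def free_energy(p, energies, beta):
    """F(p) = sum_s p(s) h(s) - T S(p), with k_B = 1, T = 1/beta and 0 ln 0 = 0."""
    energy = sum(ps * hs for ps, hs in zip(p, energies))
    entropy = -sum(ps * math.log(ps) for ps in p if ps > 0)
    return energy - entropy / beta

def boltzmann(energies, beta):
    """Boltzmann distribution and partition function for the given energies."""
    weights = [math.exp(-beta * hs) for hs in energies]
    z = sum(weights)
    return [w / z for w in weights], z

J, B, beta = 1.0, 0.3, 0.8          # illustrative (assumed) parameters
configs = list(product([-1, 1], repeat=3))
energies = [-J * (s[0] * s[1] + s[1] * s[2]) - B * sum(s) for s in configs]

p_beta, z = boltzmann(energies, beta)
f_equilibrium = free_energy(p_beta, energies, beta)

random.seed(0)
gaps = []                            # F(p) - F(p_beta) for random distributions p
for _ in range(1000):
    raw = [random.random() for _ in configs]
    total = sum(raw)
    p = [r / total for r in raw]
    gaps.append(free_energy(p, energies, beta) - f_equilibrium)
```

Each entry of `gaps` is $\beta^{-1}$ times the relative entropy appearing in (9.86), so non-negativity of the list is exactly the inequality proved above.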

Neither the local Hamiltonians (9.62) nor the local partition functions (9.81) have
a limit as $\Lambda \uparrow \mathbb{Z}^d$. A precise definition of equilibrium states of infinite classical systems
was given in 1968 by Dobrushin and by Lanford and Ruelle (DLR).

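Definition 9.23. For fixed inverse temperature $\beta \in (0,\infty)$ and fixed interaction $\Phi$,
a Gibbs measure is a (Baire = regular Borel) probability measure $\mu^\beta$ on $\underline{n}^{\mathbb{Z}^d}$ such
that for each finite $\Lambda \subset \mathbb{Z}^d$ and each pair $(s,b)$ of a spin configuration $s : \Lambda \to \underline{n}$ plus
boundary condition $b : \Lambda^c \to \underline{n}$, the conditional probability $\mu^\beta(\mathbf{s}|\mathbf{b})$ for the events

$$\mathbf{s} = \{s' \in \underline{n}^{\mathbb{Z}^d} \mid s'_{|\Lambda} = s\} \subset \underline{n}^{\mathbb{Z}^d}; \tag{9.89}$$

$$\mathbf{b} = \{s'' \in \underline{n}^{\mathbb{Z}^d} \mid s''_{|\Lambda^c} = b\}, \tag{9.90}$$

is given in terms of the local Hamiltonian $h_\Lambda(s|b)$ as defined by (9.66) by

$$\mu^\beta(\mathbf{s}|\mathbf{b}) = (Z^\beta_\Lambda(b))^{-1}e^{-\beta h_\Lambda(s|b)}; \tag{9.91}$$

$$Z^\beta_\Lambda(b) = \sum_{s\in\underline{n}^\Lambda}e^{-\beta h_\Lambda(s|b)}. \tag{9.92}$$

Recall that $\mu^\beta(\mathbf{s}|\mathbf{b}) = \mu^\beta(\mathbf{s}\cap\mathbf{b})/\mu^\beta(\mathbf{b})$, where $\mathbf{s}\cap\mathbf{b} = \{s_b\}$ consists of the single
spin configuration $s_b : \mathbb{Z}^d \to \underline{n}$ that coincides with $s$ on $\Lambda$ and coincides with $b$ on
$\Lambda^c$. Thus we may write $\mu^\beta(\mathbf{s}|\mathbf{b}) = p^\beta(s_b)/\mu^\beta(\mathbf{b})$, where $p^\beta(s) = \mu^\beta(\{s\})$ as usual.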

It was initially unclear how to generalize this highly fruitful definition of equilib¬ 
rium states in classical statistical mechanics to the quantum case, where conditional 
probabilities are not well defined (this was eventually resolved, however, through 
Definition 10.9 below). Thus a different (equally fruitful) approach to equilibrium
states of (infinite) quantum systems was developed, to which we now turn.

9.6 Equilibrium (KMS) states of quantum systems


For finite quantum spin systems we have expressions for the energy $\mathcal{E}_\Lambda$, the entropy
$S_\Lambda$, and the free energy $F^\beta_\Lambda$ that are analogous to their classical counterparts
(9.77), (9.78), and (9.79). In particular, these quantities are functions on the state
space $S(A_\Lambda)$. Since $A_\Lambda = B(H_\Lambda)$, where we assume that $H$ and hence $H_\Lambda$ is finite-dimensional,
each state $\omega_\Lambda \in S(A_\Lambda)$ is given by a density operator $\rho_\Lambda$, so that

$$\mathcal{E}_\Lambda(\omega_\Lambda) = \omega_\Lambda(h_\Lambda) = \mathrm{Tr}(\rho_\Lambda h_\Lambda); \tag{9.93}$$

$$S_\Lambda(\omega_\Lambda) = -k_B\mathrm{Tr}(\rho_\Lambda\ln\rho_\Lambda); \tag{9.94}$$

$$F^\beta_\Lambda = \mathcal{E}_\Lambda - TS_\Lambda. \tag{9.95}$$

Defining a local equilibrium state as a density matrix $\rho^\beta_\Lambda$ that minimizes the free
energy (for fixed $T$), we have the following quantum analogue of Theorem 9.22:

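Theorem 9.24. For each $T > 0$, there is a unique local equilibrium state $\omega^\beta_\Lambda$, viz.

$$\omega^\beta_\Lambda(a) = \mathrm{Tr}(\rho^\beta_\Lambda a); \tag{9.96}$$

$$\rho^\beta_\Lambda = (Z^\beta_\Lambda)^{-1}e^{-\beta h_\Lambda}; \tag{9.97}$$

$$Z^\beta_\Lambda = \mathrm{Tr}(e^{-\beta h_\Lambda}). \tag{9.98}$$

Accordingly, the free energy in equilibrium is given by

$$F^\beta_\Lambda \equiv F^\beta_\Lambda(\rho^\beta_\Lambda) = -\beta^{-1}\ln Z^\beta_\Lambda. \tag{9.99}$$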

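Proof. One proof is analogous to the classical case, in that for all density operators $\rho_\Lambda$ on $H_\Lambda$,

$$F^\beta_\Lambda(\rho_\Lambda) \ge -\beta^{-1}\ln Z^\beta_\Lambda, \tag{9.100}$$

with equality iff $\rho_\Lambda = \rho^\beta_\Lambda$. This, in turn, follows from the inequality

$$\mathrm{Tr}(a(\ln b - \ln a)) \le \mathrm{Tr}(b - a), \tag{9.101}$$

with equality iff $b = a$, which is valid for matrices $a,b$ for which $a \ge 0$ (in the usual
sense that $\lambda \ge 0$ for each $\lambda \in \sigma(a)$) and $b > 0$ in that $\lambda > 0$ for each $\lambda \in \sigma(b)$. The
case $a = \rho_\Lambda$ and $b = \rho^\beta_\Lambda$ immediately gives the claim. □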
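The quantum statement can be probed in the same spirit as the classical one. The following sketch assumes $k_B = 1$, a randomly generated Hermitian matrix in place of $h_\Lambda$, and randomly generated density matrices as competitors; all names and numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta = 4, 0.7                                   # illustrative (assumed) values
x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
h = (x + x.conj().T) / 2                           # stand-in for the local Hamiltonian

def gibbs_state(h, beta):
    """Density matrix exp(-beta h) / Z and partition function Z = Tr exp(-beta h)."""
    energies, vecs = np.linalg.eigh(h)
    weights = np.exp(-beta * energies)
    z = weights.sum()
    return vecs @ np.diag(weights / z) @ vecs.conj().T, z

def free_energy(rho, h, beta):
    """F(rho) = Tr(rho h) + (1/beta) Tr(rho ln rho), with k_B = 1."""
    probs = np.linalg.eigvalsh(rho)
    probs = probs[probs > 1e-15]                   # convention 0 ln 0 = 0
    return np.trace(rho @ h).real + np.sum(probs * np.log(probs)) / beta

def random_density_matrix(n):
    y = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = y @ y.conj().T                           # positive semidefinite
    return rho / np.trace(rho).real                # normalized to unit trace

rho_beta, z = gibbs_state(h, beta)
f_equilibrium = free_energy(rho_beta, h, beta)
gaps = [free_energy(random_density_matrix(n), h, beta) - f_equilibrium
        for _ in range(500)]
```

A random search of this kind cannot prove minimality, of course; it merely fails to find a density matrix with lower free energy than the Gibbs state.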

What remains to be done, however, is to define equilibrium states for infinite systems.
This is achieved through the so-called KMS-condition, which is based on the
observation that for any $a,b \in A_\Lambda$, in terms of (9.40) the state (9.96) satisfies

$$\omega^\beta_\Lambda(\alpha^{(\Lambda)}_t(a)b) = \omega^\beta_\Lambda(b\alpha^{(\Lambda)}_{t+i\beta}(a)) \quad (t \in \mathbb{R}). \tag{9.102}$$

Moreover, in finite systems this condition (even at $t = 0$) fully characterizes $\omega^\beta_\Lambda$:

Proposition 9.25. Let $h$ be a self-adjoint operator on a finite-dimensional Hilbert
space $H'$, with associated density operator $\rho$ and (complex) time-evolution given by

$$\rho = \frac{e^{-h}}{\mathrm{Tr}(e^{-h})}; \tag{9.103}$$

$$\alpha_z(a) = e^{izh}ae^{-izh}, \quad z \in \mathbb{C},\ a \in B(H'), \tag{9.104}$$

respectively (the exponentials being defined by a norm-convergent power series).
Then the associated two-point functions defined by $\omega(a) = \mathrm{Tr}(\rho a)$ satisfy

$$\omega(ab) = \omega(b\alpha_i(a)) \quad (a,b \in B(H')). \tag{9.105}$$

Conversely, any state for which (9.105) holds for given $h$ is given by (9.103).

Proof. Eq. (9.105) follows from (9.103) - (9.104) and cyclicity of the trace, i.e.,
(A.78). Similarly, given non-degeneracy of the Hilbert-Schmidt inner product (B.495)
on $B(H')$, eq. (9.105) is equivalent to the condition

$$\rho a = e^{-h}ae^{h}\rho, \tag{9.106}$$

for each $a \in B(H')$. Multiplying with $\exp(h)$ shows that $\exp(h)\rho$ commutes with
every $a \in B(H')$. Since $B(H')' = \mathbb{C}\cdot 1_{H'}$, we obtain $\exp(h)\rho = \lambda\cdot 1_{H'}$. Since $\exp(h)$
is invertible with inverse $\exp(-h)$, we obtain $\rho = \lambda\cdot\exp(-h)$, upon which the normalization
condition $\mathrm{Tr}(\rho) = 1$ yields (9.103). □
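Both directions of Proposition 9.25 can be illustrated with small matrices. The sketch below, which uses a randomly generated Hermitian $h$ and random test matrices $a,b$ (all assumptions of the example), evaluates both sides of (9.105) for the Gibbs state (9.103) and, for contrast, for the normalized trace, which is not of the form (9.103) for generic $h$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3

def random_matrix():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

x = random_matrix()
h = (x + x.conj().T) / 2                    # a generic (assumed) Hamiltonian
energies, vecs = np.linalg.eigh(h)

def exp_h(z):
    """exp(z h) for complex z, via the spectral decomposition of h."""
    return vecs @ np.diag(np.exp(z * energies)) @ vecs.conj().T

def alpha(z, a):
    """Complex time evolution alpha_z(a) = exp(i z h) a exp(-i z h)."""
    return exp_h(1j * z) @ a @ exp_h(-1j * z)

rho = exp_h(-1) / np.trace(exp_h(-1)).real  # Gibbs state exp(-h) / Tr exp(-h)
a, b = random_matrix(), random_matrix()

kms_lhs = np.trace(rho @ a @ b)             # omega(ab)
kms_rhs = np.trace(rho @ b @ alpha(1j, a))  # omega(b alpha_i(a))

tracial = np.eye(n) / n                     # a state that is not Gibbs for generic h
tracial_lhs = np.trace(tracial @ a @ b)
tracial_rhs = np.trace(tracial @ b @ alpha(1j, a))
```

Note that `alpha(1j, a)` equals $e^{-h}ae^{h}$, which is unbounded in neither sense here but is no longer a *-automorphism, since $z = i$ is not real.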

For arbitrary C*-algebras $A$ with time-evolution $t \mapsto \alpha_t$, expressions like $\alpha_{t+i\beta}(a)$
may not be defined, so one has to proceed more carefully, but the idea is the same.

Definition 9.26. Let $A$ be a C*-algebra with an automorphism group $\alpha$. A KMS
state at "inverse temperature" $\beta \in \mathbb{R}$ is a state $\omega$ on $A$ with the following property:

1. For any $a,b \in A$, the function $F_{a,b} : t \mapsto \omega(b\alpha_t(a))$ from $\mathbb{R}$ to $\mathbb{C}$ has an analytic
continuation to the strip

$$\mathcal{S}_\beta = \{z \in \mathbb{C} \mid 0 \le \mathrm{Im}(z) \le \beta\}, \tag{9.107}$$

where it is holomorphic in the interior and continuous on the boundary

$$\partial\mathcal{S}_\beta = \mathbb{R}\cup(\mathbb{R} + i\beta). \tag{9.108}$$

2. The boundary values of $F_{a,b}$ are related, for all $t \in \mathbb{R}$, by

$$F_{a,b}(t) = \omega(b\alpha_t(a)); \tag{9.109}$$

$$F_{a,b}(t + i\beta) = \omega(\alpha_t(a)b). \tag{9.110}$$

If this is the case, $\omega$ satisfies the KMS-condition at (inverse temperature) $\beta$.

It is easy to show that $A$ has a dense subset $A_\alpha$ such that for any $a \in A_\alpha$ the function
$t \mapsto \alpha_t(a)$ from $\mathbb{R}$ to $A$ extends to an entire $A$-valued analytic function, written $z \mapsto
\alpha_z(a)$ (i.e., for each $\varphi \in A^*$ the function $z \mapsto \varphi(\alpha_z(a))$ from $\mathbb{C}$ to $\mathbb{C}$ is entire analytic).
Namely, for any $a \in A$ and $\varepsilon > 0$, define

ae= [ (9.111) 

which satisfies $a_\varepsilon \in A_\alpha$ and $\lim_{\varepsilon\to 0}a_\varepsilon = a$. If $A = B(H')$ with $\dim(H') < \infty$, we even
have $B(H')_\alpha = B(H')$, since (9.104) is entire analytic in $z$ for any $a \in B(H')$. For
any $A$, the KMS-condition on $\omega$ is then equivalent to the simpler requirement

$$\omega(ab) = \omega(b\alpha_{i\beta}(a)) \quad (a \in A_\alpha,\ b \in A). \tag{9.112}$$

Corollary 9.27. If $A = B(H')$ with $\dim(H') < \infty$, then KMS states (at fixed $\beta$) are
necessarily given by the equilibrium states of Theorem 9.24 and hence are unique.

Although initially the characterization of equilibrium states of infinite systems by 
the KMS condition was tentative, in the 1970s and ’80s it became clear that it was 
spot on, being equivalent to local and global thermodynamic stability (against per¬ 
turbations of the dynamics), the (local) maximum entropy principle, etc. Also: 

Proposition 9.28. A KMS state at $\beta \in \mathbb{R}\setminus\{0\}$ is time-independent.

Proof. We just sketch the proof if $A$ is unital. Taking $b = 1_A$, for fixed $a \in A_\alpha$ the
function $F_{a,1_A} \equiv F$ defined by $F(z) = \omega(\alpha_z(a))$ is entire analytic on $\mathbb{C}$. Writing
$z = t + is$ (with $s,t \in \mathbb{R}$), we have $\alpha_z = \alpha_t\circ\alpha_{is}$ and hence (since each $\alpha_t$ is an
automorphism and hence an isometry), $|F(t + is)| \le \|\alpha_{is}(a)\|$. Also, (9.112) yields
$F(t + i(s + \beta)) = F(t + is)$. Hence $F(t + is)$ is bounded in $t$ and periodic in $s$; by the
latter property its supremum on $\mathbb{C}$ may be computed by its supremum on the strip
$\mathcal{S}_\beta$, and by the former property this supremum is finite. Therefore, $F$ is bounded,
and so by Liouville's Theorem it must be constant, especially if $z = t \in \mathbb{R}$. Hence
$\alpha_t^*\omega(a) = \omega(a)$ for each $a \in A_\alpha$, and since this is a dense set, $\alpha_t^*\omega = \omega$. □


By the argument for ground states following Definition 9.16, the automorphism
group $t \mapsto \alpha_t$ is unitarily implemented in the GNS-representation $\pi_\omega$ induced by
a KMS state $\omega$, such that (9.51) - (9.52) hold. However, the operator $h_\omega$ in this construction
should not be confused with the Hamiltonian of the system. For example,
suppose $A = B(H')$ for some (not necessarily finite-dimensional) Hilbert space $H'$,
so that (9.39) holds for some (not necessarily bounded) Hamiltonian $h$ with discrete
spectrum, such that $\exp(-\beta h) \in B_1(H')$. If we now define the density operator

$$\rho = \frac{e^{-\beta h}}{\mathrm{Tr}(e^{-\beta h})}, \tag{9.113}$$

then the corresponding state $\omega$ satisfies the KMS-condition at $\beta$. Generalizing the
computations around (2.66) in §2.4, we then find (up to unitary equivalence):

$$H_\omega = B_2(H'); \tag{9.114}$$

$$\pi_\omega(a)b = ab; \tag{9.115}$$

$$\Omega_\omega = \rho^{1/2}; \tag{9.116}$$

$$e^{ith_\omega} = \pi_\omega(e^{ith})\pi'_\omega(e^{-ith}), \tag{9.117}$$

where for any $a \in B(H')$, the operator $\pi'_\omega(a)$ on $B_2(H')$ is defined by

$$\pi'_\omega(a)b = ba. \tag{9.118}$$
Note that (9.115) is well defined, since $\rho \ge 0$ and $\rho \in B_1(H')$, whence $\rho^{1/2} \in
B_2(H')$, and hence also $ab \in B_2(H')$ and $ba \in B_2(H')$, since $B_2(H')$ is a two-sided
ideal in $B(H')$. If $h$ happens to be bounded, we may therefore write

$$h_\omega = \pi_\omega(h) - \pi'_\omega(h). \tag{9.119}$$

Note that the term involving $\pi'_\omega$ in (9.117) is not needed for (9.52), since $[\pi_\omega(a),\pi'_\omega(b)] = 0$
for any $a,b \in B(H')$, but it is necessary to secure (9.51). Another feature of this
example is that the vector $\Omega_\omega$ is not only cyclic for $\pi_\omega(B(H'))$, which it has to be
by virtue of the GNS-construction, but also separating, i.e., $\pi_\omega(a)\Omega_\omega = 0$ implies
$\pi_\omega(a) = 0$. In other words, one has $\omega(a^*a) = 0$ iff $a = 0$ (which is by no means the
case for ground states). If $\dim(H') < \infty$, this is obvious, because $\pi_\omega(a)\Omega_\omega = a\rho^{1/2}$
and $\rho^{1/2}$ is invertible. In general, for arbitrary C*-algebras $A$ we have:

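Proposition 9.29. Let $\omega$ be a KMS state on $A$ at $\beta \in \mathbb{R}$. Then $\Omega_\omega$ is both cyclic and
separating for $\pi_\omega(A)$ and hence also for $\pi_\omega(A)''$ (as well as for $\pi_\omega(A)'$).

Proof. Since $\omega(a^*a) = \|\pi_\omega(a)\Omega_\omega\|^2$, we have $\omega(a^*a) = 0$ iff $\pi_\omega(a)\Omega_\omega = 0$, so that

$$\omega(a^*\alpha_t(a)) = \langle\pi_\omega(a)\Omega_\omega,\pi_\omega(\alpha_t(a))\Omega_\omega\rangle = 0 \quad (t \in \mathbb{R})$$

if $\omega(a^*a) = 0$, and hence $F_{a,a^*}(t) = 0$, cf. (9.109). The "edge of the wedge" theorem
then gives $F_{a,a^*}(z) = 0$ for all $z \in \mathcal{S}_\beta$, upon which the KMS-condition gives

$$\omega(aa^*) = F_{a,a^*}(i\beta) = 0.$$

This means that $\omega(a^*a) = 0$ iff $\omega(aa^*) = 0$, or $\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a^*)\Omega_\omega = 0$, and
hence $\pi_\omega(b^*)\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a^*)\pi_\omega(b)\Omega_\omega = 0$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(A)$,
the assumption $\pi_\omega(a)\Omega_\omega = 0$ therefore implies that the bounded operator $\pi_\omega(a^*)$
vanishes on a dense domain in $H_\omega$ and hence vanishes. Since $\pi_\omega(a) = (\pi_\omega(a^*))^*$, it
follows that $\pi_\omega(a) = 0$. The extension to $\pi_\omega(A)''$ (and $\pi_\omega(A)'$) is obvious. □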

Corollary 9.30. If $\omega$ is a KMS state on a quasi-local algebra $A$, i.e., given by (8.130)
with $\dim(H) < \infty$, then $\omega(a^*a) = 0$ iff $a = 0$ and hence the GNS-representation
$\pi_\omega : A \to B(H_\omega)$ is injective.

Proof. By the previous proof, the closed left-ideal (C.204) is actually a two-sided 
ideal, which must be zero, since A is simple (as is easily shown from the simplicity 
of B(H) for finite-dimensional H, cf. §8.5). □ 
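
As a finite-dimensional sanity check of the example built on (9.113), the following sketch verifies numerically that the Gibbs state satisfies the KMS condition, that the Hilbert-Schmidt picture reproduces the state, and that the vector $\rho^{1/2}$ is invariant and separating. It is only an illustration under stated assumptions: the Hamiltonian is a random Hermitian matrix, the KMS condition is used in the form $\omega(a\,\alpha_{i\beta}(b)) = \omega(ba)$ (the form applied in the proof of Proposition 9.29), the implementing unitary is taken to act as $b \mapsto e^{ith} b\, e^{-ith}$, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, t = 4, 0.7, 0.3                      # illustrative dimension, inverse temperature, time

def rand_op():
    """Random complex n x n matrix, standing in for an element of B(H')."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

x = rand_op()
h = (x + x.conj().T) / 2                      # hypothetical Hamiltonian (random Hermitian matrix)
evals, u = np.linalg.eigh(h)

w = np.exp(-beta * evals)
w /= w.sum()                                  # eigenvalues of rho = exp(-beta h) / Tr exp(-beta h)
rho = (u * w) @ u.conj().T                    # density operator, cf. (9.113)
omega_vec = (u * np.sqrt(w)) @ u.conj().T     # Omega = rho^{1/2}, a vector in B_2(H')

def alpha(z, a):
    """alpha_z(a) = e^{izh} a e^{-izh}, also for complex z."""
    return (u * np.exp(1j * z * evals)) @ u.conj().T @ a @ (u * np.exp(-1j * z * evals)) @ u.conj().T

def omega(a):
    """The Gibbs state omega(a) = Tr(rho a)."""
    return np.trace(rho @ a)

def hs_inner(b, c):
    """Hilbert-Schmidt inner product <b, c> = Tr(b^* c)."""
    return np.trace(b.conj().T @ c)

a, b = rand_op(), rand_op()

# KMS condition at beta: omega(a alpha_{i beta}(b)) = omega(b a)
kms_lhs = omega(a @ alpha(1j * beta, b))
kms_rhs = omega(b @ a)

# GNS picture: omega(a) = <Omega, pi(a) Omega> with pi(a) b = a b
gns_ok = np.isclose(hs_inner(omega_vec, a @ omega_vec), omega(a))

# The implementing unitary b -> e^{ith} b e^{-ith} leaves Omega invariant
invariant = np.allclose(alpha(t, omega_vec), omega_vec)

# ... and implements the dynamics: U pi(a) U^{-1} b = alpha_t(a) b
covariant = np.allclose(alpha(t, a @ alpha(-t, b)), alpha(t, a) @ b)

# Omega is separating because rho^{1/2} is invertible (smallest singular value > 0)
separating = np.linalg.svd(omega_vec, compute_uv=False).min() > 0

print(abs(kms_lhs - kms_rhs), gns_ok, invariant, covariant, separating)
```

Dropping the right-hand factor $e^{-ith}$ in `alpha(t, omega_vec)` would break the invariance check but not the covariance check, which mirrors the remark that this factor is needed for (9.51) and not for (9.52).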
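Proposition 9.29 shows that the von Neumann algebra $\pi_\omega(A)''$ is in standard
form (see Definition C.158), so that the KMS condition brings us into the realm of
the Tomita-Takesaki theory. In particular, Theorem C.159 provides us with another
time-evolution, namely the one given by the modular group. In the situation of
Theorem C.159, we take $a \in M_\alpha$ and $b \in M$, and compute

$$\begin{aligned}
\langle \Omega, b\,\alpha_{-i}(a)\Omega\rangle &= \langle \Omega, b\Delta a\Delta^{-1}\Omega\rangle = \langle \Omega, b\Delta a\Omega\rangle \\
&= \langle \Delta^{1/2}b^*\Omega, \Delta^{1/2}a\Omega\rangle = \langle J\Delta^{1/2}a\Omega, J\Delta^{1/2}b^*\Omega\rangle \\
&= \langle Sa\Omega, Sb^*\Omega\rangle = \langle a^*\Omega, b\Omega\rangle \qquad (9.120) \\
&= \langle \Omega, ab\,\Omega\rangle, \qquad (9.121)
\end{aligned}$$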

where we used the property $\Delta^{-1}\Omega = \Omega$ as well as anti-unitarity of $J$, which implies
$\langle J\psi, J\varphi\rangle = \langle \varphi, \psi\rangle$; these facts follow from the definitions of $\Delta$ and $J$ via $S$.
Therefore, the state $\omega$ on $M$ defined by $\omega(a) = \langle \Omega, a\Omega\rangle$ ($a \in M$) satisfies the
KMS-condition for the modular group at $\beta = -1$. If, on the other hand, we start with a
$\beta$-KMS state $\omega$ on a C*-algebra $A$ with respect to some given time-evolution $\alpha_t$, and
take $H = H_\omega$, $M = \pi_\omega(A)''$, and $\Omega = \Omega_\omega$, the normal extension of $\omega$ to $\pi_\omega(A)''$ given
by $\langle \Omega_\omega, \cdot\,\Omega_\omega\rangle$ still satisfies the KMS condition with respect to the time-evolution on
$\pi_\omega(A)''$ given by conjugation with $\exp(ith_\omega)$, as in (9.52). Comparing the latter with
the time-evolution on $M$ defined by conjugation with $\Delta^{it}$ (cf. Theorem C.159) gives

$$e^{ith_\omega} = \Delta^{-it/\beta}, \tag{9.122}$$

since both one-parameter groups of unitary operators satisfy the KMS-condition at
$\beta$, and some time-evolution $\alpha_t$ that satisfies the KMS-condition relative to a given
state $\omega$ and inverse temperature $\beta$ is unique. To see this (barring technicalities about
unbounded operators that are easily dealt with), take $\beta = -1$ for simplicity, assume
$\alpha_t$ is conjugation by $\Delta^{it} = \exp(ith)$ (i.e., $\Delta = \exp(h)$), and rewrite (9.112) as

$$\omega(ab) = \langle b^*\Omega, \Delta a\Omega\rangle. \tag{9.123}$$

This determines $\langle \varphi, \Delta\psi\rangle$ between a dense set of vectors $\varphi$, $\psi$, and hence fixes $\Delta$.

The operators $J$ and $\Delta$ from the Tomita-Takesaki theory can explicitly be
computed in the example (9.113); the antilinear operator $J : B_2(H') \to B_2(H')$ reads

$$Jb = b^*, \tag{9.124}$$

so that the (antilinear) isomorphism $a \mapsto JaJ$ between $\pi_\omega(A)'' = B(H')$ (where $B(H')$ acts on
$B_2(H')$ by left multiplication) and its commutant $\pi_\omega(A)' = B(H')$ (which copy of
$B(H')$ now acts on $B_2(H')$ by right multiplication) is given by $JaJb = ba^*$. Furthermore,
the (generally unbounded) linear operator $\Delta : B_2(H') \to B_2(H')$ is given by

$$\Delta b = \rho b \rho^{-1}, \tag{9.125}$$

which strictly speaking is defined as the closure of the expression (9.125) on the
domain of all $b \in B_2(H')$ for which $\rho b \rho^{-1} \in B_2(H')$.
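
The explicit modular data of this example can be checked in finite dimensions. The sketch below is an illustration only (random Hermitian matrix as a hypothetical Hamiltonian, matrices standing in for Hilbert-Schmidt operators, all names chosen here): it verifies $S a\Omega = a^*\Omega$ for $S = J\Delta^{1/2}$, the relation $e^{ith_\omega} = \Delta^{-it/\beta}$ of (9.122) with $e^{ith_\omega} b = e^{ith} b\, e^{-ith}$ taken as an assumption, the identity $JaJb = ba^*$ that follows from antilinearity of $J$, and $\Delta\Omega = \Omega$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta, t = 3, 0.5, 0.8                      # illustrative values

def rand_op():
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

x = rand_op()
h = (x + x.conj().T) / 2                      # hypothetical Hamiltonian (random Hermitian matrix)
evals, u = np.linalg.eigh(h)
w = np.exp(-beta * evals)
w /= w.sum()                                  # eigenvalues of rho = exp(-beta h) / Tr exp(-beta h)

def rho_power(z):
    """rho^z for real or complex z, via the spectral decomposition of h."""
    return (u * np.exp(z * np.log(w))) @ u.conj().T

def evolve(s, b):
    """e^{ish} b e^{-ish} for real s."""
    phase = np.exp(1j * s * evals)
    return (u * phase) @ u.conj().T @ b @ (u * phase.conj()) @ u.conj().T

Omega = rho_power(0.5)                        # the vector rho^{1/2}

def J(b):
    return b.conj().T                         # modular conjugation, cf. (9.124)

def Delta_power(z, b):
    return rho_power(z) @ b @ rho_power(-z)   # Delta^z b = rho^z b rho^{-z}, cf. (9.125)

def S(b):
    return J(Delta_power(0.5, b))             # S = J Delta^{1/2}

a, b = rand_op(), rand_op()

tomita_ok = np.allclose(S(a @ Omega), a.conj().T @ Omega)               # S a Omega = a^* Omega
modular_ok = np.allclose(Delta_power(-1j * t / beta, b), evolve(t, b))  # Delta^{-it/beta} = e^{it h_omega}
commutant_ok = np.allclose(J(a @ J(b)), b @ a.conj().T)                 # J a J b = b a^*
fixed_ok = np.allclose(Delta_power(1.0, Omega), Omega)                  # Delta Omega = Omega

print(tomita_ok, modular_ok, commutant_ok, fixed_ok)
```

In infinite dimensions $\Delta$ is unbounded and the domain questions mentioned in the text become essential; the matrices here sidestep them entirely.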
Theorem 9.31. For given unital C*-algebra $A$, dynamics $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$, and
inverse temperature $\beta \in \mathbb{R}$, let $S_\beta(A)$ be the compact convex set of KMS states. Then

$$\partial_e S_\beta(A) = S_\beta(A) \cap S_p(A), \tag{9.126}$$

where $S_p(A)$ is the set of primary states on $A$ (cf. Definition 8.17). Consequently,
extreme KMS states at fixed inverse temperature $\beta$ are either equal or disjoint.

This suggests that extreme KMS states define pure thermodynamic phases.

Proof. We enlarge $S_\beta(A)$ to the set $K_\beta(A) \subset A^*$ of all continuous linear functionals
on $A$ that satisfy the $\beta$-KMS condition (so that $S_\beta(A)$ consists of all positive elements
in $K_\beta(A)$ of unit norm). The key to the proof is a bijection between the set $S(\omega)$ of
functionals $\rho \in K_\beta(A)$ for which $0 \leq \rho \leq \omega$, where $\omega \in S_\beta(A)$ is fixed, and the set
$T(\omega)$ of operators $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$ such that $0 \leq c \leq 1_{H_\omega}$, given by

$$\rho(a) = \langle \Omega_\omega, c\,\pi_\omega(a)\Omega_\omega\rangle. \tag{9.127}$$

This implies the claim, since $\omega \in \partial_e S_\beta(A)$ iff any $\rho \in S(\omega)$ takes the form $\rho = t\omega$ for
some $t \in [0,1]$ (cf. Lemma C.17), which in turn is the case iff $c = t \cdot 1_{H_\omega}$.

First, for any state $\omega \in S(A)$ there is a bijection between the set of linear
functionals $\rho \in A^*$ for which $0 \leq \rho \leq \omega$ and the set of operators $c \in \pi_\omega(A)'$ such that
$0 \leq c \leq 1_{H_\omega}$, given by (9.127). Indeed, in one direction, given $a = b^*b \geq 0$, we have

$$(\omega - \rho)(a) = \langle \pi_\omega(b)\Omega_\omega, (1_{H_\omega} - c)\,\pi_\omega(b)\Omega_\omega\rangle \geq 0, \tag{9.128}$$

for if $0 \leq c \leq 1_{H_\omega}$, then $0 \leq (1_{H_\omega} - c) \leq 1_{H_\omega}$. Hence $\rho \leq \omega$, whilst from (9.127)
we similarly find $\rho \geq 0$. Conversely, $\rho$ induces a quadratic form $R$ on $H_\omega$, defined
initially on the dense domain $\pi_\omega(A)\Omega_\omega$ by the formula

$$R(\pi_\omega(a)\Omega_\omega, \pi_\omega(b)\Omega_\omega) = \rho(a^*b), \tag{9.129}$$

which is easily seen to be well defined, positive, and bounded, and so Proposition
B.79 supplies the operator $c$, which a simple computation shows to be in $\pi_\omega(A)'$.

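For the bijection $S(\omega) \cong T(\omega)$, where $\omega$ is a $\beta$-KMS state as above, we therefore
need the additional property $c \in \pi_\omega(A)''$. Putting $\beta = -1$ for convenience and using
the notation of Theorem C.159, we first show that $\Delta^{-it} c \Delta^{it} = c$ for any $t \in \mathbb{R}$: indeed,
since $\rho$ satisfies the KMS condition, it is time-translation invariant, so that

$$\begin{aligned}
\langle \pi_\omega(a^*)\Omega_\omega, \Delta^{-it} c \Delta^{it} \pi_\omega(b)\Omega_\omega\rangle &= \langle \Omega_\omega, c\,\Delta^{it}\pi_\omega(a)\Delta^{-it}\Delta^{it}\pi_\omega(b)\Delta^{-it}\Omega_\omega\rangle \\
&= \langle \Omega_\omega, c\,\pi_\omega(\alpha_t(ab))\Omega_\omega\rangle \\
&= \rho(\alpha_t(ab)) = \rho(ab) \\
&= \langle \pi_\omega(a^*)\Omega_\omega, c\,\pi_\omega(b)\Omega_\omega\rangle,
\end{aligned}$$

so that $\Delta^{-it} c \Delta^{it} = c$ between a dense set of states, and hence this is valid as an
operator equation. This also implies that $c$ commutes with any power of $\Delta$. Define
$c' = JcJ$, which by Theorem C.159 is an element of $\pi_\omega(A)''$, and compute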
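$$\begin{aligned}
\langle \Omega_\omega, \pi_\omega(a)c'\Omega_\omega\rangle &= \langle \Omega_\omega, \pi_\omega(a)Jc\Delta^{1/2}\Omega_\omega\rangle = \langle \Omega_\omega, \pi_\omega(a)J\Delta^{1/2}c\,\Omega_\omega\rangle \\
&= \langle \Omega_\omega, \pi_\omega(a)Sc\,\Omega_\omega\rangle = \langle \Omega_\omega, \pi_\omega(a)c^*\Omega_\omega\rangle \\
&= \langle \Omega_\omega, \pi_\omega(a)c\,\Omega_\omega\rangle \\
&= \rho(a), \qquad (9.130)
\end{aligned}$$

where we used the properties $J\Omega_\omega = \Omega_\omega$, $\Delta^{1/2}\Omega_\omega = \Omega_\omega$, $c\Delta^{1/2} = \Delta^{1/2}c$ as just
mentioned, $S = J\Delta^{1/2}$ and $c^* = c$ (since $c \geq 0$). Finally, it follows from the KMS
condition (applied to the normal extension of the state $\omega$ to $\pi_\omega(A)''$ given by $\langle \Omega_\omega, \cdot\,\Omega_\omega\rangle$
as well as to the normal extension of $\rho$ to $\pi_\omega(A)''$ given by $\langle \Omega_\omega, \cdot\,c'\Omega_\omega\rangle$ just
computed) that $c' \in \pi_\omega(A)'$, since for arbitrary $a,b,d \in A_\alpha$ we have

$$\begin{aligned}
\omega(ac'bd) &= \omega(\alpha_i(bd)ac') = \rho(\alpha_i(bd)a) = \rho(\alpha_i(b)\alpha_i(d)a) \\
&= \rho(\alpha_i(d)ab) = \omega(\alpha_i(d)abc') = \omega(abc'd).
\end{aligned}$$

In other words, for any $a,b,d \in A$ we have

$$\langle \pi_\omega(a^*)\Omega_\omega, c'\pi_\omega(b)\pi_\omega(d)\Omega_\omega\rangle = \langle \pi_\omega(a^*)\Omega_\omega, \pi_\omega(b)c'\pi_\omega(d)\Omega_\omega\rangle, \tag{9.131}$$

so that $c'\pi_\omega(b) = \pi_\omega(b)c'$ between vectors in a dense domain, so that this is an
operator equality. Hence $c' \in \pi_\omega(A)'$, and in view of this we may rewrite (9.130)
as $\rho(a) = \langle \Omega_\omega, c'\pi_\omega(a)\Omega_\omega\rangle$. Since the operator $c \in \pi_\omega(A)'$ in (9.127) is uniquely
determined by $\rho$, this shows that $c' = c$. Since we already had $c' \in \pi_\omega(A)''$, it follows
that $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$. □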

It can also be shown that $S_\beta(A)$ is a (Choquet) simplex, which is a property rather
more typical of the state space of a commutative unital C*-algebra; this makes it
especially remarkable for the set of $\beta$-KMS states on a highly non-commutative
C*-algebra like the infinite tensor product of $B = M_n(\mathbb{C})$. In the physically relevant case
where $S_\beta(A)$ is metrizable, this implies that for any given KMS state $\omega \in S_\beta(A)$ there
is a unique probability measure $\mu$ on $\partial_e S_\beta(A)$, such that for each $a \in A$,

$$\omega(a) = \int_{\partial_e S_\beta(A)} d\mu(\omega')\, \omega'(a). \tag{9.132}$$

Conversely, any probability measure $\mu$ on $\partial_e S_\beta(A)$ defines a $\beta$-KMS state by reading
this equality from right to left. Towards the next chapter, suppose for example that
there is a $G$-action on $A$, i.e., a continuous homomorphism $\gamma : G \to \mathrm{Aut}(A)$ (where
$G$ is a locally compact group). Then $G$ also acts on $S(A)$ via the dual maps $\gamma_g^*(\omega) =
\omega \circ \gamma_{g^{-1}}$, and if $G$ is a symmetry of the dynamics in that $\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t$ for each
$t \in \mathbb{R}$ and $g \in G$, then this dual action maps both $S_\beta(A)$ and $\partial_e S_\beta(A)$ into themselves.
If $G$ is compact with normalized Haar measure $\mu$, then for any fixed extremal KMS
state $\omega_0 \in \partial_e S_\beta(A)$, by (left) invariance of $\mu$ one obtains a $G$-invariant state by

$$\omega = \int_G d\mu(g)\, \gamma_g^* \omega_0. \tag{9.133}$$
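
The averaging construction (9.133) can be illustrated in the smallest possible setting. The sketch below makes several simplifying assumptions that are not in the text: $A$ is the algebra of $2 \times 2$ matrices, $G = \mathbb{Z}_2$ acts by conjugation with a spin flip, the normalized Haar measure is the uniform average over the two group elements, and the dynamics (hence the KMS property) is not modelled at all. The names `gamma`, `omega0` and `omega` are chosen here for illustration.

```python
import numpy as np

identity = np.eye(2, dtype=complex)
flip = np.array([[0, 1], [1, 0]], dtype=complex)   # unitary implementing the nontrivial element of Z_2
group = [identity, flip]

def gamma(u, a):
    """Automorphism gamma_g(a) = u a u^* implemented by the unitary u."""
    return u @ a @ u.conj().T

rho0 = np.diag([1.0, 0.0]).astype(complex)         # density matrix of a non-invariant state ("spin up")

def omega0(a):
    return np.trace(rho0 @ a)

def omega(a):
    """Group average of the dual action: mean over g of omega0(gamma_{g^{-1}}(a))."""
    # gamma_{g^{-1}} is implemented by the adjoint unitary u^*
    return sum(omega0(gamma(u.conj().T, a)) for u in group) / len(group)

sigma_z = np.diag([1.0, -1.0]).astype(complex)
rng = np.random.default_rng(2)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

print(omega0(sigma_z), omega0(gamma(flip, sigma_z)), omega(sigma_z))
print(np.isclose(omega(gamma(flip, a)), omega(a)))
```

The original state distinguishes "up" from "down", whereas the averaged state is the symmetric mixture of the two and no longer does; this is the pattern of invariant mixtures of non-invariant states that the next chapter is concerned with.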
Notes

§9.1. Symmetries of C*-algebras and Hamhalter’s Theorem 

Theorem 9.4 is due to Hamhalter (2011). Our proof, taken almost verbatim from
Landsman & Lindenhovius (2016), roughly follows his, but adds various details and
also takes some different turns. The main differences with the original proof by
Hamhalter are the following. Firstly, we give an order-theoretic characterization
of u.s.c. decompositions of the form $\pi_K$ (and hence of the commutative algebras
in $\mathcal{C}(C(X))$ that are the unitization of some ideal) by the three axioms stated in
Lemma 3.1.1 in Firby (1973), whereas Hamhalter uses Proposition 7 in Mendivil
(1999), which gives a different characterization of unitizations of ideals. Furthermore,
Hamhalter only treats Lemma 9.5 in full generality, whereas in our opinion it
is very instructive to take the case of finite sets first, where many of the key ideas
already appear in a setting where they are not overshadowed by topological
complications. Finally, our proof of Lemma 9.6.2 differs from Hamhalter’s proof. The
topology of partitions may be found in Willard (1970), especially Theorem 9.9.

Theorem 9.7 is due to Hamhalter (2015). Corollary 9.9 has a long history, starting 
with Jacobson & Rickart (1950) and ending with Thomsen (1982). 

§9.2. Unitary implementability of symmetries 

See Bratteli & Robinson (1987), §4.3. 

§9.3. Motion in space and in time 

For a far more detailed study of asymptotic abelianness see Bratteli & Robinson 
(1987), §4.3.2 and Bratteli & Robinson (1997), §5.4.1. Results like Theorem 9.14 
may also be found in Sewell (2002). Theorem 9.14 is also valid for ergodic states
with respect to the given $\mathbb{Z}^d$-action, where we say that a state on a C*-algebra $A$
with $G$-action is ergodic if it is an element of $\partial_e(S(A)^G)$, i.e., extreme in the convex
set of $G$-invariant states on $A$. Also Theorem 9.15 holds (with a more complicated
proof, of course) under weaker conditions on $\Phi$, typically exponential decay in $X$.

Theorem 9.15 is the simplest result in this direction; for similar results under
weaker assumptions on the interaction $\Phi$, see Bratteli & Robinson (1997), §6.2.1.

§9.4. Ground states of quantum systems 

The idea of a ground state of a quantum system may be attributed to Bohr (1913), 
who postulated that an atom has a state of lowest energy (which he called a
“permanent state”). See e.g. Pais (1986), p. 199. In this section, which merely presents
some key points treated in far more detail in Bratteli & Robinson (1997), §5.3.3. and 
§6.2.7, we have just scratched the surface of the topic, which is basic to physics. 

§9.5. Ground states and equilibrium states of classical spin systems 

Basic references for the mathematical physics of classical spin systems on a
lattice are Israel (1979), Simon (1993), van Enter, Fernandez, & Sokal (1993), and
Georgii (2011). One may now define pure thermodynamic phases as extreme
elements of the compact convex set of all Gibbs measures (or of the set of all
translation-invariant Gibbs measures, as in Simon, 1993, §III.5), but there is no
identification of pure thermodynamic phases with primary equilibrium states (as
in the quantum case), because a state on a commutative C*-algebra like $C(\underline{n}^{\mathbb{Z}^d})$ is
primary iff it is pure. Fortunately, the specific measure-theoretic setting of
classical statistical mechanics provides its own resources. For any $\Lambda \subset \mathbb{Z}^d$, let $\Sigma_\Lambda$ be the
smallest $\sigma$-algebra (within the Borel $\sigma$-algebra for $\underline{n}^{\mathbb{Z}^d}$) for which each $f \in C(\underline{n}^{\Lambda})$
is measurable, and let

$$\Sigma_\infty = \bigcap_{\Lambda} \Sigma_{\Lambda^c}, \tag{9.134}$$

where each $\Lambda$ is finite, be the $\sigma$-algebra at infinity, with associated commutative
C*-algebra of all bounded measurable functions on $\underline{n}^{\mathbb{Z}^d}$ that are $\Sigma_\infty$-measurable.
This is the home of the macroscopic observables, defined as averages
analogously to the quantum case. The role of primary states (or rather of states
whose algebra of observables is trivial at infinity, as in Theorem 8.23) is now played
by states that are trivial at infinity, that is, probability measures $\mu$ on $\underline{n}^{\mathbb{Z}^d}$ for which
either $\mu(X) = 0$ or $\mu(X) = 1$ for $X \in \Sigma_\infty$ (cf. the Kolmogorov 0-1 law of probability
theory). Indeed there is a classical version of Theorem 8.23, making exactly the
same claim mutatis mutandis, see Theorem III.1.6 in Simon (1993). The main result
(cf. Theorem 7.7 in Georgii, 2011), is that a state is extreme in the compact convex
set of all Gibbs measures (at fixed temperature and potential, of course) iff it is a
Gibbs measure that is trivial at infinity. It follows that two distinct extreme Gibbs
measures are mutually singular on $\Sigma_\infty$ (which is the pertinent classical version of
disjointness of primary states).

§9.6. Equilibrium (KMS) states of quantum systems

The KMS condition was introduced by Haag, Hugenholtz, and Winnink (1967),
in the following equivalent form:

$$\int_{-\infty}^{\infty} dt\, f(t - i\beta)\, \omega(a\alpha_t(b)) = \int_{-\infty}^{\infty} dt\, f(t)\, \omega(\alpha_t(b)a), \tag{9.135}$$

for each $a,b \in A$ and each Schwartz function $f$ on $\mathbb{R}$. The name KMS derives
from the earlier observation (9.102) of Kubo (1957) and independently Martin &
Schwinger (1959). See also Haag (1992), Simon (1993), Borchers (2000), Sewell
(2002), Thirring (2002), Emch (2007), and perhaps also, at a heuristic level,
Landsman & van Weert (1987), especially for applications of the KMS condition to
quantum field theory at finite temperature and the quark-gluon plasma (this, incidentally,
was the MSc thesis as well as the first major published paper by the author).

The KMS condition also plays a major role in operator algebras and noncommutative
geometry; see Connes (1994) and Connes & Marcolli (2008).

For a proof of (9.101) see Bratteli & Robinson (1997, Lemma 6.2.21); this book 
is the bible about the KMS condition and its application to quantum spin systems. 

The proof of Proposition 9.25 is taken from Simon (1993), Lemma IV.4.1 and 
Proposition IV.4.2. The terminology of pure thermodynamical phases for primary 
KMS states (introduced after Theorem 9.31) is not completely standard; also ergodic 
states are sometimes called ‘pure phases’. 
Chapter 10

Spontaneous Symmetry Breaking 


As we shall see, the undeniable natural phenomenon of spontaneous symmetry 
breaking (SSB) seems to indicate a serious mismatch between theory and reality. 
This mismatch is well expressed by what is sometimes called Earman’s Principle:

‘While idealizations are useful and, perhaps, even essential to progress in physics, a sound
principle of interpretation would seem to be that no effect can be counted as a genuine
physical effect if it disappears when the idealizations are removed.’ (Earman, 2004, p. 191)

To describe the various examples apparently violating Earman’s Principle (and
hence the link between theory and reality) in a general way (so general even that it
will encapsulate the measurement problem), it is convenient to introduce a definition:

Definition 10.1. Asymptotic emergence is the conjunction of three conditions:

1. A higher-level theory H (which is often called a phenomenological theory or
a reduced theory) is a limiting case of some lower-level theory L (often called
a fundamental theory or a reducing theory).

2. Theory H is well defined and understood by itself (typically predating L).

3. Theory H has features that cannot be explained by L, e.g. because L does not
have any property inducing those feature(s) in the pertinent limit to H.

In connection with SSB (as item 3.) we will look at the following pairs (H, L): 

• - H is classical mechanics (notably of a particle on the real line $\mathbb{R}$);

- L is quantum mechanics (on the pertinent Hilbert space $L^2(\mathbb{R})$);

- The limiting relationship between the two theories is as described in §7.1
(notably by the continuous bundle of C*-algebras (7.17) - (7.19) for $n = 1$).

• - H is classical thermodynamics of a spin system; 

- L is statistical mechanics of a quantum spin system on a finite lattice; 

- Their limiting relationship is as described in §8.6 (cf. Theorem 8.4). 

• - H is statistical mechanics of an infinite quantum spin system; 

- L is statistical mechanics of a quantum spin system on a finite lattice; 

- The limiting relationship between H and L is given in §8.6 (cf. Theorem 8.8). 
Of course, there are many other interesting examples of (apparent) asymptotic
emergence not treated in this book, such as geometric optics (as H) versus wave optics
(as L), where the new feature of H would be the absence of interference of light
rays (foreshadowing the measurement problem of quantum mechanics!), or
hydrodynamics (as H) versus molecular dynamics (as L), where the new feature is
irreversibility. Perhaps space-time asymptotically emerges from quantum gravity.

The “unexplained” features of H mentioned in the third part of Definition 10.1 are 
often called emergent, although this term has to be used with great care. Its meaning 
here reflects the original use of the term by the so-called “British Emergentists” 
(whose pioneer was J.S. Mill), as expressed in 1925 by C.D. Broad: 

‘The characteristic behaviour of the whole could not, even in theory, be deduced from the 
most complete knowledge of the behaviour of its components, taken separately or in other 
combinations, and of their proportions and arrangements in this whole. This is what I
understand by the ‘Theory of Emergence’. I cannot give a conclusive example of it, since it is
a matter of controversy whether it actually applies to anything.’ (Broad, 1925, p. 59) 

In quotations like these, the notion “emergence” is meant to be the very opposite of 
the idea of “reduction” (or “mechanicism”, as Broad called it); in fact, for many
authors this opposition seems to be the principal attraction of emergence. In principle,
two rather different notions of reduction then lead (contrapositively) to two different 
kinds of emergence, which are sometimes mixed up but should be distinguished: 

1. The reduction of a whole (i.e., a composite system) to its parts; 

2. The reduction of a theory H to a theory L. 

In older literature concerned with the reduction of biology to chemistry (challenged 
by Mill) and of chemistry to physics (still contested by Broad), the first notion also 
referred to wholes consisting of a small number of particles. That notion of
emergence seems a lost cause, since, as noted by Hempel,

‘the properties of hydrogen include that of forming, if suitably combined with oxygen, a 
compound which is liquid, transparent, etc.’ (Hempel, 1965, p. 260) 

A similar comment applies to e.g. the tertiary structure of proteins, but also to cases 
of emergence such as ant hills, slime mold, and even large cities (Johnson, 2001), 
all of which are actually fascinating success stories for reductionism. 

More recently, the apparent possibility that very large assemblies of parts might
give rise to emergent properties of the corresponding wholes has become increasingly
popular, both in physics and in the philosophy of mind (where consciousness
has been proposed as an emergent property of the brain). In physics, the modern
discussion on emergence was initiated by P.W. Anderson, who in a famous
essay from 1972 called ‘More is different’ emphasized the possibility of emergence
in very large systems (surprisingly, Anderson actually avoids the term ‘emergence’,
instead speaking of ‘new laws’ and ‘a whole new conceptual structure’). In particular,
Anderson claimed SSB to be an example (if not the example) of emergence,
duly adding that one really had to take the $N \to \infty$ limit. Thus at least in physics, the
interesting case for emergence in the first (i.e. whole-part) sense arises if the ‘whole’
is strictly infinite, as in the thermodynamic limit of quantum statistical mechanics.
This example confirms that 1. and 2. often go together, but they do not always do so:
the classical limit of quantum mechanics is a case of pure theory reduction.

A clear description of emergence has also been given by Jaegwon Kim: 

‘1. Emergence of higher-level properties: All properties of higher-level entities arise out of
the properties and relations that characterize their constituent parts. Some properties of
these higher, complex systems are “emergent”, and the rest merely “resultant”. Instead
of the expression “arise out of”, such expressions as “supervene on” and “are
consequential upon” could have been used. In any case, the idea is that when appropriate
lower-level conditions are realized in a higher-level system (that is, the parts that
constitute the system come to be configured in a certain relational structure), the system
will necessarily exhibit certain higher-level properties, and, moreover, that no
higher-level property will appear unless an appropriate set of lower-level conditions is realized.
Thus, “arise” and “supervene” are neutral with respect to the emergent/resultant
distinction: both emergent and resultant properties of a whole supervene on, or arise out of,
its microstructural, or micro-based, properties. The distinction between properties that
are emergent and those that are merely resultant is a central component of
emergentism. As we have already seen, it is standard to characterize this distinction in terms of
predictability and explainability.

2. The unpredictability of emergent properties: Emergent properties are not predictable 
from exhaustive information concerning their “basal conditions”. In contrast, resultant 
properties are predictable from lower-level information. 

3. The unexplainability/irreducibility of emergent properties: Emergent properties, unlike 
those that are merely resultant, are neither explainable nor reducible in terms of their 
basal conditions.’ (Kim, 1999, p. 21, italics added) 

Similarly, Silberstein (2002) states (paraphrased) that a higher-level theory H: 

‘bears predictive/explanatory emergence with respect to some lower-level theory L if L 
cannot replace H, if H cannot be derived from L [i.e., L cannot reductively explain H], or 
if L cannot be shown to be isomorphic to H.’ 

A key point here is Kim’s no. 1: not even “emergentists” deny that the whole
consists of its parts, or, in asymptotic emergence, that the higher-level theory H in fact
originates from the lower-level theory L. The essence of emergence, then, would be 
that H nonetheless has “acquired” properties not reducible to L. One possibility for 
this to happen could be that the (allegedly) emergent property of H refers to some 
concept that does not even make sense in L, such as the experience of pain, which 
is hard to make sense of at a neural level, but another possibility, which is indeed 
the one relevant to physics and especially to SSB, is that some particular concept 
possessed by H (such as SSB) is admittedly defined within L, but banned. 

In describing the relationship between H and L we have to be clear about the 
difference between approximations and idealizations. Following Norton (2012): 

• An approximation is an inexact description of a target system. 

• An idealization is a fictitious system, distinct from the target system, some of 
whose properties provide an inexact description of aspects of the target system. 

Thus idealizations also provide approximations, but as systems they stand on their 
own and are defined independently of the target system. In our cases, the target 
system is a real physical system such as a ferromagnet or a quantum particle, which 
is supposed to be described exactly by theory L, i.e., the lower-level theory. In fact,
L is a family of theories parametrized by $1/N$ ($N \in \mathbb{N}$) or $\hbar \in (0,1]$, and our real
material relates to some very small value of this parameter (which may also be seen
as a certain regime of L, seen as a single, unparametrized theory).

The pertinent theory H is an idealization in the above sense, through which one
approximates very large systems by infinite ones and highly semi-classical ones
(where $\hbar$ is very small) by classical ones (where $\hbar = 0$). It is in this setting that
asymptotic emergence would violate Earman’s Principle and hence would blast the
relationship between theory and reality: the abstract point (made concrete for SSB
earlier on) is that if some real property of a real system is described by H but is not
approximated in any sense by L in any regime (as is the threat with SSB), although H
is supposed to be a limit of L, then the latter theory L fails to describe the real system
it is supposed to describe, whereas this system is described by the theory H, which
portrays fictitious systems. This marks a difference with other cases of emergence,
where H (including some “whole”) is not an idealization but a real system itself (as
might be the case with consciousness and other examples from neuroscience and
the philosophy of mind). Thus our discussion does not apply to such cases.

The tension between SSB and Earman’s Principle has not quite gone unnoticed in
the philosophy of physics literature. For example, Liu and Emch (2005) first write
that it is a mistake to regard idealizations as acts of ‘neglecting the negligible’ (p.
155, which already appears to deny Earman’s Principle), and continue by:

“The broken symmetry in question is not reducible to the configurations of the microscopic 
parts of any finite systems; but it should supervene on them in the sense that for any two 
systems that have the exactly (sic) duplicates of parts and configurations, both will have the 
same spontaneous symmetry breaking in them because both will behave identically in the 
limit. In other words, the result of the macroscopic limit is determined by the non-relational 
properties of parts of the finite system in question.” (Liu & Emch, 2005, p. 156) 

It is not easy to make sense of this, but the authors genuinely seem to believe in
asymptotic emergence and hence they (again) appear to deny Earman’s Principle.
Another suggestion, made by Ruetsche, is to modify Earman’s Principle to:

‘No effect predicted by a non-final theory can be counted as a genuine physical effect if it 
disappears from that theory’s successors.’ (Ruetsche, 2011, p. 336) 

For example, the theory L explaining SSB should not be quantum statistical
mechanics but quantum field theory (which has an infinite number of ultraviolet degrees of
freedom even in finite volume, and hence in principle allows SSB). This does make
sense within physics, but, as Ruetsche herself notices, her principle ‘has the
pragmatic shortcoming that we can’t apply it until we know what (all) successors to our
present theories are.’ With due respect, we will describe a rather different way out,
based on unexpectedly implementing Butterfield’s Principle, which is a corollary
to Earman’s Principle that removes the reduction-emergence opposition:

‘there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to 
the limit, i.e. for finite N. And it is this weaker behaviour which is physically real.’ 

(Butterfield, 2011, p. 1065) 

To do so, we now turn our attention to specific (classes of) models of SSB. 
10.1 Spontaneous symmetry breaking: The double well


The simplest example of SSB is undoubtedly the equation $x^2 = 1$ (where $x \in \mathbb{C}$),
which is invariant under a $\mathbb{Z}_2$ symmetry given by $x \mapsto -x$. Its solutions $x = \pm 1$,
then, do not share this symmetry; instead $\mathbb{Z}_2$ acts nontrivially on the solution space.

Another example that is simple at least compared to quantum spin systems is
provided by elementary quantum mechanics. Thus we are now in the context of the
first of the three pairs (H, L) listed in the preamble to this chapter, where, in detail:

• H is classical mechanics of a particle moving on the real line, with associated
phase space $\mathbb{R}^2 = \{(p,q)\}$ and ensuing C*-algebra of observables $A_0 = C_0(\mathbb{R}^2)$;

• L is the corresponding quantum theory, with a C*-algebra of observables $A_\hbar$
($\hbar > 0$) taken to be the compact operators $B_0(L^2(\mathbb{R}))$ on the Hilbert space $L^2(\mathbb{R})$;

• The relationship between H and L is given by the continuous bundle of
C*-algebras (7.17) - (7.19), for $n = 1$, notably in the classical limit $\hbar \to 0$.

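At the level of states, the passage to the classical limit $\hbar \to 0$ of any $\hbar$-dependent
wave-function $\psi_\hbar \in L^2(\mathbb{R})$, if it exists, is described via the associated probability
measure $\mu_{\psi_\hbar}$ on $\mathbb{R}^2$, which is defined by (7.31); in other words,

$$d\mu_{\psi_\hbar}(p,q) = \frac{dp\,dq}{2\pi\hbar}\, |\langle \phi_\hbar^{(p,q)}, \psi_\hbar\rangle|^2, \tag{10.1}$$

where the (Schrödinger) coherent states $\phi_\hbar^{(p,q)} \in L^2(\mathbb{R})$ are given by (7.27), i.e.,

$$\phi_\hbar^{(p,q)}(x) = (\pi\hbar)^{-1/4}\, e^{-ipq/2\hbar}\, e^{ipx/\hbar}\, e^{-(x-q)^2/2\hbar}. \tag{10.2}$$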

In terms of the associated vector states $\omega_{\psi_\hbar}$ on the C*-algebra $B_0(L^2(\mathbb{R}))$, one has

$$\omega_{\psi_\hbar}(Q_\hbar^B(f)) = \langle \psi_\hbar, Q_\hbar^B(f)\psi_\hbar\rangle = \int_{\mathbb{R}^2} d\mu_{\psi_\hbar}(p,q)\, f(p,q), \tag{10.3}$$

where $f \in C_0(\mathbb{R}^2)$. We then say that the wave-functions $\psi_\hbar$ have a classical limit if

$$\lim_{\hbar \to 0} \int_{\mathbb{R}^2} d\mu_{\psi_\hbar} f = \int_{\mathbb{R}^2} d\mu_0 f, \tag{10.4}$$

for any $f \in C_0(\mathbb{R}^2)$, where $\mu_0$ is some probability measure on $\mathbb{R}^2$. Seen as a state $\omega_0$
on the classical C*-algebra of observables $C_0(\mathbb{R}^2)$, the probability measure $\mu_0$ is
regarded as the classical limit of the family $\omega_{\psi_\hbar}$ of states on the C*-algebra $B_0(L^2(\mathbb{R}))$
of quantum-mechanical observables. This family is continuous in the sense that the
function $\hbar \mapsto \omega_{\psi_\hbar}(\sigma(\hbar))$ from $[0,1]$ to $\mathbb{C}$ is continuous for every continuous
cross-section $\sigma$ of the given bundle of C*-algebras. An example of such a continuous
cross-section is $\sigma(0) = f$ and $\sigma(\hbar) = Q_\hbar^B(f)$, for any $f \in C_0(\mathbb{R}^2)$, cf. (C.550) -
(C.551), and indeed this example reproduces (10.4), which after all is just

$$\lim_{\hbar \to 0} \omega_{\psi_\hbar}(Q_\hbar^B(f)) = \omega_0(f) \quad (f \in C_0(\mathbb{R}^2)). \tag{10.5}$$
First, let us illustrate this formalism for the ground state of the one-dimensional
harmonic oscillator. Taking $m = 1/2$ and $V(x) = \frac{1}{2}\omega^2 x^2$ in the usual Hamiltonian

$$h_\hbar = -\hbar^2 \frac{d^2}{dx^2} + V(x), \tag{10.6}$$

it is well known that the ground state is unique and that its wave-function, i.e.,

$$\psi_\hbar^{(0)}(x) = \left(\frac{\omega}{\sqrt{2}\,\pi\hbar}\right)^{1/4} \exp\left(-\frac{\omega x^2}{2\sqrt{2}\,\hbar}\right), \tag{10.7}$$

is a Gaussian, peaked above $x = 0$. As $\hbar \to 0$, this ground state has a classical limit,
namely the Dirac measure $\mu_0$ concentrated at the origin $(p = 0, q = 0)$, i.e.,

$$\lim_{\hbar \to 0} \int_{\mathbb{R}^2} d\mu_{\psi_\hbar^{(0)}} f = f(0,0) \quad (f \in C_0(\mathbb{R}^2)). \tag{10.8}$$

This is just the unique ground state of the corresponding classical Hamiltonian

$$h_0(p,q) = p^2 + V(q), \tag{10.9}$$

seen as a point in the phase space minimizing $h_0$, reinterpreted as a probability
measure on phase space as explained in the context of Theorem 3.3. Note that we
kept the mass fixed at $m = 1/2$, but instead we could have kept $\hbar$ fixed and taken the
limit $m \to \infty$ instead of $\hbar \to 0$; cf. the preamble to Chapter 7.

The same features hold for the anharmonic oscillator (with small $\lambda > 0$), i.e.,

$$V(x) = \tfrac{1}{2}\omega^2 x^2 + \tfrac{1}{4}\lambda x^4. \tag{10.10}$$

However, a new situation arises for the symmetric double-well potential

$$V(x) = -\tfrac{1}{2}\omega^2 x^2 + \tfrac{1}{4}\lambda x^4 + \tfrac{1}{4}\omega^4/\lambda = \tfrac{1}{4}\lambda (x^2 - a^2)^2, \tag{10.11}$$

where $a = \omega/\sqrt{\lambda} > 0$ (assuming $\omega > 0$ as well as $\lambda > 0$). This time, the ground
state of the classical Hamiltonian is doubly degenerate, being given by the points
$(p = 0, q = \pm a) \in \mathbb{R}^2$, with ensuing Dirac measures $\mu_0^\pm$ given by

$$\int_{\mathbb{R}^2} d\mu_0^\pm f = f(0, \pm a). \tag{10.12}$$

But it is a deep and counterintuitive fact of quantum theory that the corresponding
quantum Hamiltonian (10.6) with (10.11) has a unique ground state. Indeed:

Theorem 10.2. Let $V \in L^2_{\mathrm{loc}}(\mathbb{R}^m)$ be positive and suppose that $\lim_{|x| \to \infty} V(x) = \infty$.
Then $-\Delta + V$ has a nondegenerate (and strictly positive) ground state.

Roughly speaking, the proof is based on an infinite-dimensional version of the
Perron-Frobenius Theorem in linear algebra (applied to $\exp(-t h_\hbar)$ rather than to
the Hamiltonian $h_\hbar$ itself, so that the largest eigenvalue of the former corresponds to
the smallest eigenvalue of the latter, i.e., the energy of the ground state).
And yet there are two quantum-mechanical shadows of the classical degeneracy:

• The wave-function $\psi_\hbar^{(0)}$ of the ground state (which by a suitable choice of phase
may be taken to be real) is positive definite and has two peaks, above $x = \pm a$, with
exponential decay $|\psi_\hbar^{(0)}(x)| \sim \exp(-1/\hbar)$ in the classically forbidden region.

• Energy eigenfunctions (and the associated eigenvalues) come in pairs.

In what follows, we will be especially interested in the first excited state $\psi_\hbar^{(1)}$, which
like $\psi_\hbar^{(0)}$ is real, but has one peak above $x = a$ and another peak below $x = -a$. See
Figure 10.1. The eigenvalue splitting (or “gap”) vanishes exponentially in $-1/\hbar$ like

$\Delta_\hbar \equiv E_\hbar^{(1)} - E_\hbar^{(0)} \sim \frac{\hbar\omega}{\sqrt{\frac{1}{2} e \pi}}\, e^{-d_V/\hbar}$ ($\hbar \to 0$), (10.13)

where the typical WKB-factor is given by

$d_V = \int_{-a}^{a} dx\, \sqrt{V(x)}$. (10.14)
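
The qualitative statements above (a unique, strictly positive, doubly peaked ground state and a gap that shrinks rapidly as $\hbar$ decreases) can be checked numerically. The following is only a minimal sketch under assumptions that are not in the text: $\omega = \lambda = 1$ (so $a = 1$), a finite-difference discretization of (10.6) on the interval $[-3, 3]$ with Dirichlet boundary conditions, and the hypothetical helper names `double_well`, `lowest_levels` and `wkb_exponent`.

```python
import numpy as np

def double_well(x, omega=1.0, lam=1.0):
    """V(x) = (lam/4) (x^2 - a^2)^2 with a = omega/sqrt(lam), cf. (10.11)."""
    a2 = omega**2 / lam
    return 0.25 * lam * (x**2 - a2) ** 2

def lowest_levels(hbar, n=601, half_width=3.0):
    """Two lowest eigenpairs of -hbar^2 d^2/dx^2 + V on a symmetric grid.

    Assumption of this sketch: central differences, Dirichlet boundaries.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    kin = hbar**2 / dx**2
    H = (np.diag(2.0 * kin + double_well(x))
         - kin * np.eye(n, k=1) - kin * np.eye(n, k=-1))
    E, psi = np.linalg.eigh(H)
    return x, E[:2], psi[:, :2]

def wkb_exponent(omega=1.0, lam=1.0, n=20001):
    """d_V = integral of sqrt(V) from -a to a, cf. (10.14), by the trapezoid rule."""
    a = omega / np.sqrt(lam)
    x = np.linspace(-a, a, n)
    y = np.sqrt(double_well(x, omega, lam))
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * (x[1] - x[0]))

if __name__ == "__main__":
    print("d_V =", wkb_exponent())
    for hbar in (0.5, 0.2, 0.1):
        _, E, _ = lowest_levels(hbar)
        print(f"hbar = {hbar}: gap E1 - E0 = {E[1] - E[0]:.3e}")
```

The grid is symmetric under $x \mapsto -x$, so the discretized Hamiltonian keeps the reflection symmetry of the double well; this is what allows the computed ground state to be compared with the even, two-peaked function described in the text.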


Also, the probability density of each of the wave-functions $\psi_\hbar^{(0)}$ or $\psi_\hbar^{(1)}$ contains
approximate $\delta$-function peaks above both classical minima $\pm a$. See Figure 10.2,
displayed just for $\psi_\hbar^{(0)}$, the other being analogous. We can make the correspondence
between the nondegenerate pair $(\psi_\hbar^{(0)}, \psi_\hbar^{(1)})$ of low-lying quantum-mechanical wave-
functions and the pair $(\omega_0^+, \omega_0^-)$ of degenerate classical ground states more
transparent by invoking the above notion of a classical limit of states. Indeed, in terms of
the corresponding algebraic states $\omega_{\psi_\hbar^{(0)}}$ and $\omega_{\psi_\hbar^{(1)}}$, one has

$\lim_{\hbar\to 0} \omega_{\psi_\hbar^{(0)}} = \lim_{\hbar\to 0} \omega_{\psi_\hbar^{(1)}} = \frac{1}{2}(\omega_0^+ + \omega_0^-)$, (10.16)

where $\omega_0^\pm$ are the pure classical ground states (10.12) of the double-well
Hamiltonian, i.e., $\omega_0^\pm(f) = f(0, \pm a)$. To see this, one may consider numerically computed
Husimi functions, as shown in Figure 10.3 (just for $\psi_\hbar^{(0)}$, as before). From this, it is
clear that the pure (algebraic) quantum ground state converges to the mixed classical state (10.16).
In contrast, the localized (but now time-dependent) wave-functions

$\psi_\hbar^\pm = \frac{\psi_\hbar^{(0)} \pm \psi_\hbar^{(1)}}{\sqrt{2}}$, (10.17)

which of course define pure states as well, converge to pure classical states, i.e.,

$\lim_{\hbar\to 0} \omega_{\psi_\hbar^\pm} = \omega_0^\pm$. (10.18)

In conclusion, one has SSB in H, but at first sight the underlying theory L seems to
forbid it. Yet we will now show that (10.17) - (10.18) will save Earman’s Principle.
Fig. 10.1 Double-well potential with ground state $\psi^{(0)}_{\hbar=0.5}$ and first excited state $\psi^{(1)}_{\hbar=0.5}$.

Fig. 10.2 Probability densities for $\psi_\hbar^{(0)}$ at two values of $\hbar$ (left and right).

Fig. 10.3 Husimi functions for $\psi_\hbar^{(0)}$.
10.2 Spontaneous symmetry breaking: The flea


Regarding the doubly-peaked ground state $\psi_\hbar^{(0)}$ of the symmetric double well as
the quantum-mechanical counterpart of a hung parliament, the analogue of a small
party that decides which coalition is formed is a tiny asymmetric perturbation
$\delta V$ of the potential. Indeed, the following spectacular phenomenon in the theory
of Schrödinger operators was discovered in 1981 by Jona-Lasinio, Martinelli and
Scoppola. In view of the extensive (and very complicated) ensuing mathematical
literature, we just take it as our goal to explain the main idea in a heuristic way.
Replace $V$ in (10.6) by $V + \delta V$, where $\delta V$ (i.e., the “flea”) is assumed to:

1. Be real-valued with fixed sign, and $C^\infty$ (hence bounded) with connected support
not including the minima $x = a$ or $x = -a$;

2. Satisfy $|\delta V| \gg e^{-d_V/\hbar}$ for sufficiently small $\hbar$ (e.g., by being independent of $\hbar$);

3. Be localized not too far from at least one of the minima, in the following sense.
First, for $y, z \in \mathbb{R}$ and $A \subset \mathbb{R}$, we extend the notation (10.14) to

$d_V(y,z) = \left| \int_y^z dx\, \sqrt{V(x)} \right|$; (10.19)

$d_V(y,A) = \inf\{ d_V(y,z),\ z \in A \}$. (10.20)

Second, we introduce the symbols

$d'_V = 2 \cdot \min\{ d_V(-a, \mathrm{supp}\, \delta V),\ d_V(a, \mathrm{supp}\, \delta V) \}$; (10.21)

$d''_V = 2 \cdot \max\{ d_V(-a, \mathrm{supp}\, \delta V),\ d_V(a, \mathrm{supp}\, \delta V) \}$. (10.22)

The localization assumption on $\delta V$ is that one of the following conditions holds:

$d'_V < d_V < d''_V$; (10.23)

$d'_V < d''_V < d_V$. (10.24)

In the first case, the perturbation is typically localized either on the left or on the
right edge of the double well, whereas in the second it resides on the middle bump
(symmetric perturbations are excluded by 3, as these would satisfy $d'_V = d''_V$).

Under these assumptions, the ground state wave-function $\psi_\hbar^{(\delta)}$ of the perturbed
Hamiltonian (which had two peaks for $\delta V = 0$!) localizes as $\hbar \to 0$, in a direction
which, given that localization happens, may be understood from energetic
considerations. For example, if $\delta V$ is positive and is localized to the right, then the relative
energy in the left-hand part of the double well is lowered, so that localization will
be to the left. See Figures 10.4 - 10.6. Eqs. (10.17) - (10.18) then yield Butterfield’s
Principle (with $N \rightsquigarrow 1/\hbar$), so that also Earman’s Principle is saved; the essence of
the argument is that (at least in the presence of a flea-perturbation) SSB is already
foreshadowed in quantum mechanics for small yet positive $\hbar$, if only approximately.
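
The localization just described can be illustrated with a small numerical experiment. The sketch below is not the construction used in the literature cited above; it assumes $\omega = \lambda = 1$, a finite-difference discretization on $[-3, 3]$, and a hypothetical smooth bump of height `flea_height` supported on $[1.2, 2.0]$ (to the right of the minimum $x = a = 1$) as the flea. All function names are illustrative.

```python
import numpy as np

def potential(x, flea_height=0.0, center=1.6, half_width=0.4):
    """Double well (x^2 - 1)^2 / 4 plus a smooth, compactly supported bump.

    The bump (an assumption of this sketch) has fixed sign, is C-infinity,
    and its support [center - half_width, center + half_width] avoids x = +-1.
    """
    V = 0.25 * (x**2 - 1.0) ** 2
    u = (x - center) / half_width
    bump = np.zeros_like(x)
    inside = np.abs(u) < 1.0
    bump[inside] = np.exp(1.0 - 1.0 / (1.0 - u[inside] ** 2))
    return V + flea_height * bump

def left_weight(hbar, flea_height, n=601, half_width=3.0):
    """Probability that the (discretized) ground state assigns to x < 0."""
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    kin = hbar**2 / dx**2
    H = (np.diag(2.0 * kin + potential(x, flea_height))
         - kin * np.eye(n, k=1) - kin * np.eye(n, k=-1))
    _, psi = np.linalg.eigh(H)
    prob = psi[:, 0] ** 2          # eigh returns unit vectors
    return float(prob[x < 0].sum())

if __name__ == "__main__":
    for height in (0.0, 0.1, -0.1):
        print(f"flea height {height:+.1f}: weight on the left well =",
              round(left_weight(0.07, height), 4))
```

With a positive bump on the right the weight on the left well should approach one, and with a negative bump it should approach zero, in line with the energetic argument; without the bump the two wells carry equal weight.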
Fig. 10.4 Flea perturbation of ground state $\psi^{(\delta)}_{\hbar=0.5}$ with corresponding Husimi function. For such
relatively large values of $\hbar$, little (but some) localization takes place.

Fig. 10.5 Same at $\hbar = 0.01$. For such small values of $\hbar$, localization is almost total.

Fig. 10.6 First excited state for $\hbar = 0.01$. Note the opposite localization area.
In more detail, for the perturbed ground state we have (subject to assumptions 1-3):

$\frac{\psi_\hbar^{(\delta)}(a)}{\psi_\hbar^{(\delta)}(-a)} \sim e^{\mp d_V/\hbar}$ ($\pm\delta V > 0$, $\mathrm{supp}(\delta V) \subset \mathbb{R}^+$); (10.25)

$\frac{\psi_\hbar^{(\delta)}(a)}{\psi_\hbar^{(\delta)}(-a)} \sim e^{\pm d_V/\hbar}$ ($\pm\delta V > 0$, $\mathrm{supp}(\delta V) \subset \mathbb{R}^-$), (10.26)

with the opposite localization for the perturbed first excited state (so as to remain
orthogonal to the ground state). A more precise version of the energetics used above
is as follows. The ground state tries to minimize its energy according to the rules:

• The cost of localization (if $\delta V = 0$) is $O(e^{-d_V/\hbar})$.

• The cost of turning on $\delta V$ is $O(e^{-d'_V/\hbar})$ when the wave-function is delocalized.

• The cost of turning on $\delta V$ is $O(e^{-d''_V/\hbar})$ when the wave-function is localized in
the well around $x_0 = \pm a$ for which $2 d_V(x_0, \mathrm{supp}\, \delta V) = d''_V$.

In any case, these results only depend on the support of $\delta V$, but not on its size: this
means that the tiniest of perturbations may cause collapse in the classical limit.

Although the collapse of the perturbed ground state for small $\hbar$ is a mathematical
theorem, it remains enigmatic. Indeed, despite the fact that in quantum theory the
localizing effect of the flea is enhanced for small $\hbar$, the corresponding classical
system has no analogue of it. Trivially, a classical particle residing at one of the two
minima of the double well at zero (or small) velocity, i.e., in one of its degenerate
ground states, will not even notice the flea; the ground states are unchanged. But
even under a stochastic perturbation, which leads to a nonzero probability for the
particle to be driven from one ground state to the other in finite time (as some form
of classical “tunneling”, where in this case the necessary fluctuations come from
Brownian motion), the flea plays a negligible role. For example, in the case at hand
the standard Eyring-Kramers formula for the mean transition time reads

$\langle \tau \rangle \simeq \frac{2\pi}{\sqrt{V''(a)\,|V''(0)|}}\, e^{V(0)/\varepsilon}$, (10.27)

where $\varepsilon$ is the parameter in the Langevin equation $dx_t = -V'(x_t)\,dt + \sqrt{2\varepsilon}\, dW_t$, in
which $W_t$ is standard Brownian motion. Clearly, this expression only contains the
height of the potential at its maximum and its curvature at its critical points; most
perturbations satisfying assumptions 1-3 above do not affect these quantities.
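
For the double well (10.11) the ingredients of the Eyring-Kramers formula are explicit: $V''(x) = \lambda(3x^2 - a^2)$ gives $V''(a) = 2\omega^2$ and $V''(0) = -\omega^2$, while $V(0) = \omega^4/4\lambda$. The following sketch evaluates (10.27) from these three numbers only; the function name `kramers_time` and the sample parameter values are illustrative assumptions, and the absolute value of $V''(0)$ is taken inside the square root.

```python
import math

def kramers_time(omega, lam, eps):
    """Mean transition time (10.27) for V(x) = (lam/4) (x^2 - a^2)^2.

    Only the barrier height V(0) and the curvatures V''(a), V''(0) enter,
    so a flea supported away from x = 0 and x = +-a leaves the value unchanged.
    """
    a = omega / math.sqrt(lam)
    barrier = 0.25 * lam * a**4          # V(0)
    curv_min = 2.0 * lam * a**2          # V''(a)  = 2 omega^2
    curv_max = -lam * a**2               # V''(0)  = -omega^2
    prefactor = 2.0 * math.pi / math.sqrt(curv_min * abs(curv_max))
    return prefactor * math.exp(barrier / eps)

if __name__ == "__main__":
    for eps in (0.2, 0.1, 0.05):
        print(f"eps = {eps}: <tau> ~ {kramers_time(1.0, 1.0, eps):.3e}")
```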

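The instability of the ground state of the double-well potential under “flea”
perturbations as $\hbar \to 0$ is easy to understand (at least heuristically) if one truncates the
infinite-dimensional Hilbert space $L^2(\mathbb{R})$ to a two-level system. This simplification
is accomplished by keeping only the lowest energy states $\psi_\hbar^{(0)}$ and $\psi_\hbar^{(1)}$, in which
case the full Hamiltonian (10.6) with (10.11) is reduced to the $2 \times 2$ matrix

$H_0 = \frac{1}{2} \begin{pmatrix} 0 & -\Delta \\ -\Delta & 0 \end{pmatrix}$, (10.28)

with $\Delta > 0$ given by (10.13). Dropping $\hbar$, the eigenstates of $H_0$ are given by

$\varphi_0^{(0)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\varphi_0^{(1)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}$, (10.29)

with energies $E_0 = -\frac{1}{2}\Delta$ and $E_1 = \frac{1}{2}\Delta$, respectively; in particular, $E_1 - E_0 = \Delta$. If

$\varphi_0^\pm = \frac{\varphi_0^{(0)} \pm \varphi_0^{(1)}}{\sqrt{2}}$, (10.30)

as in (10.17), then

$\varphi_0^+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\varphi_0^- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. (10.31)

Hence in this approximation $\varphi_0^+$ and $\varphi_0^-$ play the role of wave-functions (10.17)
localized above the classical minima $x = +a$ and $x = -a$, respectively, with classical
limits $\omega_0^\pm$. The “flea” is introduced as follows. If its support is in $\mathbb{R}^+$, we put

$\delta_+ V = \begin{pmatrix} \delta & 0 \\ 0 & 0 \end{pmatrix}$, (10.32)

where $\delta \in \mathbb{R}$ is a constant. A perturbation with support in $\mathbb{R}^-$ is approximated by

$\delta_- V = \begin{pmatrix} 0 & 0 \\ 0 & \delta \end{pmatrix}$. (10.33)

Without loss of generality, take the latter (a change of sign of $\delta$ leads to the former).
The eigenvalues of $H^{(\delta)} = H_0 + \delta_- V$ are $E_0 = E_-$ and $E_1 = E_+$, with energies

$E_\pm = \frac{1}{2}\left(\delta \pm \sqrt{\delta^2 + \Delta^2}\right)$, (10.34)

and normalized eigenvectors

$\varphi_\delta^{(0)} = \frac{1}{\sqrt{2}} \left(\delta^2 + \Delta^2 + \delta\sqrt{\delta^2 + \Delta^2}\right)^{-1/2} \begin{pmatrix} \delta + \sqrt{\delta^2 + \Delta^2} \\ \Delta \end{pmatrix}$; (10.35)

$\varphi_\delta^{(1)} = \frac{1}{\sqrt{2}} \left(\delta^2 + \Delta^2 - \delta\sqrt{\delta^2 + \Delta^2}\right)^{-1/2} \begin{pmatrix} \sqrt{\delta^2 + \Delta^2} - \delta \\ -\Delta \end{pmatrix}$. (10.36)

Note that $\lim_{\delta\to 0} \varphi_\delta^{(i)} = \varphi_0^{(i)}$ for $i = 0,1$. Now, if $\hbar \to 0$, then $|\delta| \gg \Delta$, in which
case $\varphi_\delta^{(0)} \to \varphi_0^\pm$ for $\pm\delta > 0$ (and starting from (10.32) instead of (10.33) would have
given the opposite case, i.e., $\varphi_\delta^{(0)} \to \varphi_0^\mp$ for $\pm\delta > 0$). Thus the ground state localizes
as $\hbar \to 0$, which resembles the situation (10.25) - (10.26) for the full double-well.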
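
The two-level computation is small enough to verify directly. The sketch below diagonalizes $H_0 + \delta_- V$ numerically, with $H_0$ having zero diagonal and off-diagonal entries $-\Delta/2$ and the flea $\delta_- V = \mathrm{diag}(0, \delta)$ as in (10.33); the function name `flea_two_level` and the sample values of $\delta$ and $\Delta$ are assumptions made for illustration.

```python
import numpy as np

def flea_two_level(delta, gap):
    """Eigenvalues of H0 + diag(0, delta) and ground-state weights.

    Basis convention of this sketch: first component = state localized above
    x = +a, second component = state localized above x = -a.
    """
    H0 = 0.5 * np.array([[0.0, -gap], [-gap, 0.0]])
    flea = np.array([[0.0, 0.0], [0.0, delta]])
    E, vecs = np.linalg.eigh(H0 + flea)
    ground = vecs[:, 0]
    return E, ground**2   # energies and (weight at +a, weight at -a)

if __name__ == "__main__":
    gap = 1e-6            # stands in for the exponentially small splitting
    for delta in (0.0, 1e-2, -1e-2):
        E, w = flea_two_level(delta, gap)
        print(f"delta = {delta:+.0e}: E = {E}, weights = {w}")
```

For $\delta = 0$ both weights equal $1/2$, whereas for $|\delta| \gg \Delta$ practically all weight sits in the well that the flea does not raise, which is the localization expressed by $\varphi_\delta^{(0)} \to \varphi_0^\pm$.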

In conclusion, in the (practically unavoidable) presence of asymmetric “flea”
perturbations, explicit (rather than spontaneous) symmetry breaking already takes place
for positive $\hbar$, so that Butterfield’s Principle holds, and hence also Earman’s.
10.3 Spontaneous symmetry breaking in quantum spin systems

Before discussing SSB in quantum spin systems, we return to ground states and
KMS states as discussed in the generality of §§9.4-9.6. Starting with the former, it
is natural to ask whether ground states are pure, as would be expected on physical
grounds; indeed, this question goes to the heart of SSB. Proposition 9.20 implies
that ground states (for given dynamics) form a compact convex subset $S_\infty(A)$ of the
total state space $S(A)$; the notation $S_\infty(A)$ (rather than e.g. $S_0(A)$) will be motivated
shortly by the analogy with equilibrium states. It would be desirable that

$\partial_e S_\infty(A) = S_\infty(A) \cap \partial_e S(A)$, (10.37)

in which case extreme ground states are necessarily pure. This will indeed be the
case in the simple models we study in this book, but it is provably the case in
general only under additional assumptions, such as weak asymptotic abelianness of the
dynamics, i.e., $\lim_{t\to\infty} \omega([\alpha_t(a), b]) = 0$ for all $a, b \in A$. A weaker sufficient
condition for (10.37) is that $\pi_\omega(A)'$ be commutative (which is the case if $\omega$ is pure).

We are now in a position to define SSB, at least in the context of ground states. 

Definition 10.3. Suppose we have a (topological) group $G$ and a (continuous)
homomorphism $\gamma: G \to \mathrm{Aut}(A)$, which is a symmetry of the dynamics in that

$\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t$ ($g \in G$, $t \in \mathbb{R}$). (10.38)

The G-symmetry is said to be spontaneously broken (at temperature $T = 0$) if

$(\partial_e S_\infty(A))^G = \emptyset$, (10.39)

and weakly broken if $(\partial_e S_\infty(A))^G \neq \partial_e S_\infty(A)$, i.e., there is at least one $\omega \in \partial_e S_\infty(A)$
that fails to be G-invariant (although invariant extreme ground states may exist).

Here $\mathcal{S}^G = \{\omega \in \mathcal{S} \mid \omega \circ \gamma_g = \omega\ \forall g \in G\}$, defined for any subset $\mathcal{S} \subset S(A)$, is the
set of G-invariant states in $\mathcal{S}$. Assuming (10.37), eq. (10.39) means that there are
no pure G-invariant ground states. This by no means implies that there are no G-
invariant ground states at all, quite to the contrary: for compact, or, more generally,
amenable groups $G$, one can always construct G-invariant ground states by
averaging over $G$, exploiting the fact that if $G$ is a symmetry of the dynamics, then each
affine homeomorphism $\gamma_g^*$ of $S(A)$ (defined by $\gamma_g^*(\omega) = \omega \circ \gamma_g$) maps $S_\infty(A)$ to itself.
Definition 10.3 therefore implies that if SSB occurs, then one has a dichotomy:

• Pure ground states are not invariant, whilst invariant ground states are not pure.

Definition 10.4. We call a G-symmetry spontaneously broken at inverse
temperature $\beta \in (0,\infty)$ if there are no G-invariant extreme $\beta$-KMS states, i.e.,

$(\partial_e S_\beta(A))^G = \emptyset$, (10.40)

and weakly broken if there is at least one non-G-invariant extreme KMS state.
By Theorem 9.31 we may replace extreme $\beta$-KMS states by primary $\beta$-KMS states,
so that, similarly to ground states, SSB at nonzero temperature means that:

• Primary KMS states are not invariant, whilst invariant KMS states are not primary.

For the next result, please recall Definition 9.10 and Theorem 9.11.

Proposition 10.5. Let $A$ be a quasi-local C*-algebra of the kind (8.130) and
suppose the given G-action $\gamma$ commutes not only with time translations $\alpha_t$ but also
with space translations $\tau_x$. If $\omega \circ \gamma_g \neq \omega$ for some $\omega \in \partial_e S_\beta(A)$ and $g \in G$, then the
automorphism $\gamma_g$ cannot be unitarily implemented in the GNS-representation $\pi_\omega$.

This is true also at $\beta = \infty$, i.e., for ground states.

Proof. This is an obvious corollary of Proposition 9.13 and Theorems 9.14 and
9.31: if $\gamma_g$ were implementable by a unitary $u_g$, then $u_g \Omega_\omega \neq \Omega_\omega$ (not even up to
a phase), since $\omega \circ \gamma_g \neq \omega$. But in that case, since $\tau_x \circ \gamma_g = \gamma_g \circ \tau_x$ for each $x \in \mathbb{Z}^d$,
we would have $u_x u_g = u_g u_x$ and hence $u_x(u_g \Omega_\omega) = u_g \Omega_\omega$. Thus $u_g \Omega_\omega$ would be
another translation-invariant ground state, contradicting Theorem 9.14. □

This result is worth mentioning, since some authors define SSB through the
conclusion of this proposition, that is, they call a symmetry $\gamma_g$ (spontaneously) broken
by some state $\omega$ iff $\gamma_g$ cannot be unitarily implemented in $\pi_\omega$. This definition seems
physically dubious, however, because quantum spin systems may have ground states
$\omega$ that are not G-invariant but in which nonetheless all of $G$ is unitarily
implementable (in such states translation invariance has to be broken, of course). For
example, the Ising model in $d = 1$ with ferromagnetic nearest-neighbour interaction
and vanishing external magnetic field (where $G = \mathbb{Z}_2$) has an infinite number of
such ground states, in which a “domain wall” separates infinitely many “spins up”
to the left from infinitely many “spins down” to the right. Although this model has a
unique KMS state at any nonzero temperature, such ground states (and perhaps
analogous states at finite $\beta$ in different models, so far understood only heuristically) seem
far from pathological and play a major role in modern condensed matter physics.
Hence we trust this alternative definition only if the states it singles out also satisfy
Definition 10.3 or 10.4, for which Proposition 10.5 gives a sufficient condition: for
translation-invariant states and symmetries on quasi-local algebras, our definition of
SSB through (10.40) is compatible with the one based on unitary implementability.

This is fortunate, since the physicist’s notion of an order parameter, through 
which at least weak SSB may be detected, is tailored to translation-invariant states: 

Definition 10.6. Let $A$ be a quasi-local C*-algebra as in (8.130), with symmetry
group $G$. A (strong) order parameter in $A$ is an n-tuple $\phi = (\phi_1, \ldots, \phi_n) \in A^n$ for
which $\omega(\phi) = 0$ if (and only if) $\omega$ is G-invariant, for any $\mathbb{Z}^d$-invariant state $\omega$ on $A$.

An order parameter defines an accompanying vector field $x \mapsto \phi(x)$ by $\phi_i(x) =
\tau_x(\phi_i)$. Since $\omega$ is translation-invariant, $\omega(\phi) = 0$ is equivalent to $\omega(\phi(x)) = 0$ for
all $x$. In the Ising model, with $G = \mathbb{Z}_2$, $\sigma_3(0)$ is an order parameter, which can
be extended to a strong one $\phi = (\sigma_2(0), \sigma_3(0))$. In the Heisenberg model, where
$G = SO(3)$, the triple $(\sigma_1(0), \sigma_2(0), \sigma_3(0))$ provides a strong order parameter.
Theorem 10.7. Suppose that $\phi$ is a (strong) order parameter, as in Definition 10.6.
Then a G-invariant and translation-invariant KMS state $\omega \in S_\beta(A)^G$ (including $\beta = \infty$,
i.e., a ground state) displays weak SSB, in the sense that at least one of the
components in its extremal decomposition fails to be G-invariant, if (and only if)
the associated two-point function exhibits long-range order, in that

$\lim_{x\to\infty} \sum_{i=1}^{n} \omega(\phi_i(0)^* \phi_i(x)) > 0$. (10.41)


Proof. The “if” part of the theorem is equivalent to the vanishing of the limit in
question in the absence of SSB. Let (9.132) be the extremal decomposition of $\omega$. If
(almost) each extreme state $\omega'$ is invariant, then $\omega'(\phi_i(x)) = 0$ for all $i$ by definition of
an order parameter, and similarly $\omega'(\phi_i(x)^*) = \overline{\omega'(\phi_i(x))} = 0$. Interchanging $\lim_{x\to\infty}$
with the integral over $\partial S_\beta(A)$ (which is allowed because $\mu$ is a probability measure),
and using (9.30) then shows that the left-hand side of (10.41) vanishes.

To avoid difficult measure-theoretic aspects of the extremal decomposition
theory, and also for pedagogical purposes, we prove the “only if” part only in the case

$\omega = \int_G dg\, \omega'_g$, (10.42)

weakly, where $\omega' \in \partial S_\beta(A)$ and $\omega'_g = \omega' \circ \gamma_g$. Since the expression

$\sum_{i=1}^{n} \omega'_g(\phi_i(0)^* \phi_i(x))$

is independent of $g \in G$ (by definition of an order parameter), we may replace $\omega'_g$ by
$\omega'$ in the expression for $\omega$; the term $\int_G dg$ then factors out and is equal to unity. Thus
we may replace $\omega$ in (10.41) by $\omega'$. Since $\omega'$ is a primary state, we may now use
(9.30) once again, so that the left-hand side of (10.41) becomes $\sum_{i=1}^{n} |\omega'(\phi_i(0))|^2$. By
assumption, $\omega'$ is not G-invariant, so that (by definition of a strong order parameter)
at least one of the terms is nonzero. □
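
Long-range order in the sense of (10.41) concerns infinite systems, but a finite chain already hints at it. The following sketch computes the two-point function of $\sigma_3$ in the ground state of a short transverse-field Ising chain with free boundary conditions, $h = -\sum_x \sigma_3(x)\sigma_3(x+1) - B\sum_x \sigma_1(x)$, by brute-force diagonalization. The chain length, the sites compared, the field values and all helper names are assumptions of this illustration; sites are labelled $0, \ldots, n-1$ here.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])    # Pauli sigma_1
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli sigma_3
I2 = np.eye(2)

def site_op(op, x, n):
    """Operator acting as `op` on site x and as the identity elsewhere."""
    mats = [I2] * n
    mats[x] = op
    return reduce(np.kron, mats)

def ising_hamiltonian(n, B):
    """Open chain: -sum sigma3(x) sigma3(x+1) - B sum sigma1(x)."""
    H = np.zeros((2**n, 2**n))
    for x in range(n - 1):
        H -= site_op(S3, x, n) @ site_op(S3, x + 1, n)
    for x in range(n):
        H -= B * site_op(S1, x, n)
    return H

def two_point(n, B, x, y):
    """Ground-state expectation of sigma3(x) sigma3(y)."""
    _, vecs = np.linalg.eigh(ising_hamiltonian(n, B))
    ground = vecs[:, 0]
    corr = site_op(S3, x, n) @ site_op(S3, y, n)
    return float(ground @ corr @ ground)

if __name__ == "__main__":
    for B in (0.3, 3.0):
        print(f"B = {B}: <s3(1) s3(6)> =", round(two_point(8, B, 1, 6), 4))
```

For weak field the correlation between distant sites stays close to one although the finite-volume ground state is itself invariant, whereas for strong field it is close to zero; this is the finite-size shadow of the dichotomy in Theorem 10.7, not a proof of it.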


If $G$ is compact, for any C*-algebra $A$, invariant KMS states (including ground
states) can always be constructed via (9.133), provided, of course, KMS states (or
ground states) exist in the first place. Fortunately, existence can be shown in the
following way. Let $A$ be a quasi-local C*-algebra à la (8.130), in which:

1. $\dim(H) < \infty$ (and hence also $\dim(H_\Lambda) < \infty$ for any finite $\Lambda \subset \mathbb{Z}^d$);

2. Dynamics is defined locally on each algebra $A_\Lambda = B(H_\Lambda)$ via (9.40) and (9.41),
i.e., with free boundary conditions, having a global limit $\alpha$ as in Theorem 9.15.

In that case, by Corollary 9.27 each C*-algebra $A_\Lambda$ has a unique $\beta$-KMS state $\omega_\Lambda^\beta$,
given by the local Gibbs state (9.96). However, if $\Lambda^{(1)} \subset \Lambda^{(2)}$, then the restriction
of the $\beta$-KMS state $\omega^\beta_{\Lambda^{(2)}}$ to $A_{\Lambda^{(1)}} \subset A_{\Lambda^{(2)}}$ is not given as naively expected, namely
by the $\beta$-KMS state $\omega^\beta_{\Lambda^{(1)}}$, because the former involves boundary terms.
Fortunately, this complication may be overcome, since at least for models with
short-range forces (cf. Theorem 9.15) one may put

$\omega^\beta(a) = \lim_{N\to\infty} \omega^\beta_{\Lambda_N}(a)$, (10.43)

where $\Lambda_N$ is defined in (8.153). This limit exists for $a \in \bigcup_\Lambda A_\Lambda$, from which $\omega^\beta$
extends by continuity to all of $A$, on which it is a $\beta$-KMS state (cf. Theorem 10.10).

Alternatively, by the Hahn-Banach Theorem (in the form of Corollary B.41)
combined with Lemma C.4 (which guarantees that any Hahn-Banach extension of
a state remains a state), each local Gibbs state $\omega^\beta_\Lambda$ on $A_\Lambda \subset A$ extends, in a
non-unique way, to a state $\hat\omega^\beta_\Lambda$ on $A$. This gives a net of states $(\hat\omega^\beta_\Lambda)$ on $A$ indexed by
the finite subsets $\Lambda$ of $\mathbb{Z}^d$; one may also work with sequences $(\hat\omega^\beta_{\Lambda_N})$. Since $A$ has
a unit, its state space $S(A)$ is a compact convex set, so the above net (or sequence)
has at least one limit point, or, equivalently, has at least one convergent subnet (or
subsequence), which (despite its potential lack of uniqueness in two respects, i.e.
the choice of the extensions $\hat\omega^\beta_\Lambda$ and the choice of a limit point) one might write as

$\hat\omega^\beta = \lim_{N\to\infty} \hat\omega^\beta_{\Lambda_N}$. (10.44)

Without proof, we quote the relevant technical result (assuming 1-2 above):

Proposition 10.8. Each limit state $\hat\omega^\beta$ is a $\beta$-KMS state (i.e. for the dynamics $\alpha$).

Anticipating the existence of SSB in models, one should now feel a little uneasy:
Anticipating the existence of SSB in models, one should now feel a little uneasy: 

• It follows from Corollary 9.27 that (at fixed $\beta$) there is a unique KMS state on
each local algebra $A_\Lambda$ for the given local dynamics $\alpha^{(\Lambda)}$, namely the local Gibbs
state $\omega^\beta_\Lambda$ on $A_\Lambda$. If (as is the case in all our examples) the globally broken G-
symmetry is induced by local automorphisms $\gamma_g^{(\Lambda)}: A_\Lambda \to A_\Lambda$ that commute with
the local dynamics $\alpha^{(\Lambda)}$, then each local Gibbs state is G-invariant: this follows
explicitly from G-invariance of the local Hamiltonian $h_\Lambda$ and the formulae (9.96)
- (9.98), or, more abstractly, from the fact that if $\omega^\beta_\Lambda$ were not invariant under all
$\gamma_g^{(\Lambda)}$, it would not be unique (as its translate $\omega^\beta_\Lambda \circ \gamma_g^{(\Lambda)}$ would be another KMS state).

• And yet (in case of SSB) there exist non-invariant (and hence non-unique) KMS
states on $A$, which are even limits in the sense of (10.44) of the above invariant
(and hence unique) local KMS states on $A_\Lambda$!

• Real samples are finite and hence are described by the local algebras $A_\Lambda$, with
their unique invariant equilibrium states $\omega^\beta_\Lambda$. Yet finite samples do display SSB,
e.g., ferromagnetism (broken $\mathbb{Z}_2$-symmetry), superconductivity (broken $U(1)$).

• Therefore, the theory that should describe SSB in real materials, namely the finite
theory $A_\Lambda$, apparently fails to do so (as it seems to forbid SSB), whereas the
idealized theory $A$, which describes strictly infinite systems and in those systems
allows SSB, in fact turns out to describe key properties of finite samples.
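10.4 Spontaneous symmetry breaking for short-range forces


We continue our discussion of SSB in quantum spin systems, especially of the
construction of global KMS states in the previous section, see (10.44) and preceding
text. Recall that each finite system $A_\Lambda$ has a unique $\beta$-KMS state $\omega^\beta_\Lambda$, namely the
local Gibbs state (9.96), but that these states are incompatible for different $\Lambda$’s, in that,
if $\Lambda^{(1)} \subset \Lambda^{(2)}$, then the restriction of $\omega^\beta_{\Lambda^{(2)}}$ to $A_{\Lambda^{(1)}} \subset A_{\Lambda^{(2)}}$ is not given by $\omega^\beta_{\Lambda^{(1)}}$
because of boundary terms. To correct for this, one introduces the surface energy

$b_{\Lambda^{(1)},\Lambda^{(2)}} = \sum_{X \subset \Lambda^{(2)}:\ X \cap \Lambda^{(1)} \neq \emptyset,\ X \cap (\Lambda^{(1)})^c \neq \emptyset} \Phi(X)$, (10.45)

with ensuing interaction energy

$b_\Lambda = \lim_{\Lambda' \nearrow \mathbb{Z}^d} \sum_{X \subset \Lambda':\ X \cap \Lambda \neq \emptyset,\ X \cap \Lambda^c \neq \emptyset} \Phi(X)$, (10.46)

provided this limit exists (which it does for short-range forces). Now perturb $\omega^\beta_{\Lambda^{(2)}}$
by replacing $h_{\Lambda^{(2)}}$ in (9.96) - (9.98) (with $\Lambda \rightsquigarrow \Lambda^{(2)}$) by $h_{\Lambda^{(2)}} - b_{\Lambda^{(1)},\Lambda^{(2)}}$. Denoting
this modification of $\omega^\beta_{\Lambda^{(2)}}$ by $\tilde\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}}$, we obtain (10.47), which implies (10.48):

$\tilde\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}} = \omega^\beta_{\Lambda^{(1)}} \otimes \omega^\beta_{\Lambda^{(2)} \setminus \Lambda^{(1)}}$; (10.47)

$(\tilde\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}})_{|A_{\Lambda^{(1)}}} = \omega^\beta_{\Lambda^{(1)}}$. (10.48)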

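If (10.46) exists, we may likewise perturb any $\alpha_t$-invariant state $\omega$ on $A$ to $\tilde\omega_\Lambda$, i.e.,

$\tilde\omega_\Lambda(a) = \frac{\langle e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega,\ \pi_\omega(a)\, e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega \rangle}{\| e^{-\beta(h_\omega - \pi_\omega(b_\Lambda))/2}\Omega_\omega \|^2}$, (10.49)

where $\Lambda \subset \mathbb{Z}^d$ is finite, $h_\omega$ is defined as in (9.51) - (9.52), and $\Omega_\omega$ is in the domain
of the unbounded operator $\exp(-\beta(h_\omega - \pi_\omega(b_\Lambda))/2)$; the reason is that $\pi_\omega(b_\Lambda)$ is
bounded, whereas $\exp(-\beta h_\omega/2)\Omega_\omega = \Omega_\omega$ (since $h_\omega \Omega_\omega = 0$). For example,

$\tilde\omega_{\Lambda^{(1)}} = \tilde\omega^\beta_{\Lambda^{(1)},\Lambda^{(2)}}$, (10.50)

where $\omega = \omega^\beta_{\Lambda^{(2)}}$ is the Gibbs state on $A = A_{\Lambda^{(2)}}$, as in Theorem 9.24 (with $\Lambda \rightsquigarrow \Lambda^{(2)}$).
Indeed, using (9.114) - (9.117) and the relation $h_\omega = h_{\Lambda^{(2)}} - J h_{\Lambda^{(2)}} J$, where the
operator $J$ is defined in (9.124), we compute the numerator in (10.49), with $\Lambda = \Lambda^{(1)}$
and up to the normalization of $\Omega_\omega$, as

$\mathrm{Tr}\left( \left(e^{-\beta(h_{\Lambda^{(2)}} - J h_{\Lambda^{(2)}} J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2}\right)^* a\; e^{-\beta(h_{\Lambda^{(2)}} - J h_{\Lambda^{(2)}} J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2} \right)$
$= \mathrm{Tr}\left( e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)}\, a \right)$, (10.51)

since $J h_{\Lambda^{(2)}} J$ commutes with $h_{\Lambda^{(2)}} - b_\Lambda$. This subsequently gives

$e^{-\beta(h_{\Lambda^{(2)}} - J h_{\Lambda^{(2)}} J - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2} = e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)/2}\, e^{-\beta h_{\Lambda^{(2)}}/2}\, e^{\beta h_{\Lambda^{(2)}}/2} = e^{-\beta(h_{\Lambda^{(2)}} - b_\Lambda)/2}$. (10.52)

Likewise, the denominator in (10.49) equals $\mathrm{Tr}(\exp(-\beta(h_{\Lambda^{(2)}} - b_\Lambda)))$.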

Eqs. (10.50) and (10.48) suggest that if $\omega = \omega^\beta$ is a $\beta$-KMS state, then although
$\omega^\beta$ itself does not localize to a Gibbs state $\omega^\beta_\Lambda$ on $A_\Lambda$, its perturbed version $\tilde\omega^\beta_\Lambda$
does. Under assumptions 1-2 stated in §10.3, i.e., in the situation of Theorem 9.15
with $\dim(H) < \infty$, this motivates the following quantum analogue of the DLR
approach to classical equilibrium states, i.e., of Definition 9.23:

Definition 10.9. For fixed inverse temperature $\beta \in \mathbb{R} \setminus \{0\}$ and fixed interaction $\Phi$,
a Gibbs state $\omega^\beta$ on a quasi-local algebra $A$ with dynamics given by some potential
$\Phi$ is an $\alpha_t$-invariant state such that for each finite region $\Lambda \subset \mathbb{Z}^d$ one has

$\tilde\omega^\beta_\Lambda = \omega^\beta_\Lambda \otimes \omega'_{\Lambda^c}$, (10.53)

where $\omega^\beta_\Lambda$ is the local Gibbs state (9.96) on $A_\Lambda$ and $\omega'_{\Lambda^c}$ is some state on $A_{\Lambda^c}$.

Theorem 10.10. Under assumptions 1-2 in §10.3, and if in addition the subspace
$D = \bigcup_\Lambda A_\Lambda \subset A$ is a core for the derivation $\delta$ of (9.54) (i.e., the closure of $\delta$ defined on
$D$ is $\delta$ as defined in Proposition 9.19), then Gibbs states coincide with KMS states.

The proof is rather technical and so we omit it. It follows that if $\omega^\beta \in S_\beta(A)$, then

$(\tilde\omega^\beta_\Lambda)_{|A_\Lambda} = \omega^\beta_\Lambda$. (10.54)

Even so, we still need to define in precisely which sense the net $(\omega^\beta_{|A_\Lambda})_\Lambda$ of
restrictions converges to $\omega^\beta$ (or when perhaps even the net $(\omega^\beta_\Lambda)$ of local Gibbs states
converges to $\omega^\beta$); for simplicity we take $\Lambda = \Lambda_N$ as in (8.153), and just consider
sequences indexed by $N$ (rather than nets). To this end, let $(\omega_{1/N})_N$ be a sequence of
states with $\omega_{1/N} \in S(A_{\Lambda_N})$. As in Definition 8.24, given some $\omega_0 \in S(A)$ (if it exists),
we say that

$\lim_{N\to\infty} \omega_{1/N} = \omega_0$ (10.55)

iff for any sequence $(a_{1/N})_N$ in $A$ with $a_{1/N} \in A_{\Lambda_N} \subset A$ that converges to $a \in A$ one
has

$\lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_0(a)$. (10.56)

For example, if we take $\omega_0 \in S(A)$ and define $\omega_{1/N} = \omega_{0|A_{\Lambda_N}}$, (10.55) holds by
continuity of $\omega_0$ (as $\|\omega_0\| = 1$), which implies that $\lim_{N\to\infty} \omega_0(a_{1/N}) = \omega_0(a)$.

It follows from the comments preceding Definition 8.24 that the above notion
(10.55) - (10.56) of convergence is the same as the one given by (8.164), so that it is
similar to the convergence of states we defined for the other two classes of examples
listed earlier, viz. classical mechanics (cf. §10.1) and thermodynamics.
We denote the restriction of some global KMS state $\omega^\beta$ (defined on $A$) to $A_{\Lambda_N} \subset A$
by $\omega^\beta_{|\Lambda_N}$, whereas as usual we write $\omega^\beta_{\Lambda_N}$ for the unique local Gibbs state on $A_{\Lambda_N}$.

Keeping Definition 8.24 and Proposition 8.25 in mind, the situation is as follows:

1. Any KMS state $\omega^\beta$ equals the limit $\omega_0$ of its restrictions $\omega^\beta_{|\Lambda_N}$ (i.e. to $A_{\Lambda_N}$).

2. Each state $\omega^\beta_{|\Lambda_N}$ differs from the local Gibbs state $\omega^\beta_{\Lambda_N}$ (even if $\omega^\beta$ is unique).

3. The local Gibbs states $\omega^\beta_{\Lambda_N}$ typically converge to a KMS state $\omega_0$, cf. (10.43).

4. In models with symmetry, this global Gibbs state $\omega_0$ is invariant (like the $\omega^\beta_{\Lambda_N}$).

The first claim follows from the argument given after (10.55). The second is the
contrapositive to (10.54) and has been explained in §10.3: although the states $\omega^\beta_{|\Lambda_N}$
and $\omega^\beta_{\Lambda_N}$ are both of local Gibbs type, their Hamiltonians differ by the
boundary term $b_{\Lambda_N}$. The third claim cannot be proved in general, but in models with
short-range forces it holds in both forms (10.43) and (10.55) - (10.56). In such
models the G-symmetry is local, i.e., $G$ acts on each $A_\Lambda$ through unitaries

$u_g^{(\Lambda)} = \otimes_{x\in\Lambda}\, u_g(x)$; (10.57)

$\gamma_g^{(\Lambda)}(a_\Lambda) = u_g^{(\Lambda)} a_\Lambda (u_g^{(\Lambda)})^*$ ($a_\Lambda \in A_\Lambda$, $g \in G$), (10.58)

where $u_g(x) \in B(H_x)$, leaving each local Hamiltonian $h_\Lambda$ and hence each local Gibbs
state $\omega^\beta_\Lambda$ invariant. If $a \in A$ is local, i.e., $a \in \bigcup_\Lambda A_\Lambda$, then

$\gamma_g(a) = \lim_{N\to\infty} \gamma_g^{(\Lambda_N)}(a)$, (10.59)

followed by continuous extension to $a \in A$, so that, assuming (10.55),

$\omega_0(\gamma_g(a)) = \lim_{N\to\infty} \omega_{1/N}(\gamma_g(a_{1/N})) = \lim_{N\to\infty} \omega_{1/N}(\gamma_g^{(\Lambda_N)}(a_{1/N})) = \lim_{N\to\infty} \omega_{1/N}(a_{1/N}) = \omega_0(a)$,

since $\omega_{1/N} \circ \gamma_g^{(\Lambda_N)} = \omega_{1/N}$ by assumption. Thus the global Gibbs state $\omega_0$ inherits
the G-invariance of its local approximants $\omega^\beta_{\Lambda_N}$. In case of SSB, the restrictions $\omega^\beta_{|\Lambda_N}$
of some non-invariant extreme KMS state $\omega^\beta$ determine $\omega^\beta$, so that in principle SSB
is detectable through the local states $\omega^\beta_{|\Lambda_N}$. It would be question-begging to construct
the latter from the global states $\omega^\beta$, though, so Butterfield’s Principle (and hence in
its wake Earman’s Principle) holds only if we can show how and why the states of
sufficiently large yet finite systems $A_{\Lambda_N}$ tend to $\omega^\beta_{|\Lambda_N}$ rather than to $\omega^\beta_{\Lambda_N}$.

Unfortunately, showing any of this in specific models at finite (inverse)
temperature $0 < \beta < \infty$ is pretty complicated. For example, in the quantum Ising model
(9.42) in $d = 1$, KMS states are unique for any $B$, so that for SSB one must go to
$d \geq 2$. In that case, it can be shown from Theorem 10.7 that for $B = 0$, below some
critical temperature (i.e. for $\beta > \beta_c$) the $\mathbb{Z}_2$ symmetry defined in (10.68) below is
broken, but this takes considerable effort and is beyond the scope of this book.
10.5 Ground state(s) of the quantum Ising chain

It is much simpler to put $\beta = \infty$ and hence turn to the ground state(s) of the quantum
Ising model (9.42) in $d = 1$, which is manageable. The interesting case is $B > 0$, with
$J = 1$ and free boundary conditions, so that for $\Lambda = \Lambda_N$ (with $N$ even), we have

$h_N = -\sum_{x \in \Lambda_N} (\sigma_3(x)\sigma_3(x+1) + B\sigma_1(x))$; (10.60)

$\Lambda_N = \{-\frac{1}{2}N, \ldots, \frac{1}{2}N - 1\}$; (10.61)

$H_{\Lambda_N} = H_N = \otimes_{x \in \Lambda_N} H_x$; (10.62)

$H_x = \mathbb{C}^2$ ($x \in \Lambda_N$), (10.63)

where the operator $\sigma_i(x)$ acts as the Pauli matrix $\sigma_i$ on $H_x$ and as the unit matrix
$1_2$ elsewhere (with free boundary conditions, the coupling term whose site $x+1$ lies outside
$\Lambda_N$ is omitted). This model describes a chain of $N$ immobile spin-$\frac{1}{2}$ particles with
ferromagnetic coupling in a transverse magnetic field (it is a special case of the so-
called XY-model, to which similar conclusions apply). The local Hamiltonians $h_N$
define time evolution on the local algebras

$A_{\Lambda_N} = A_N = B(H_N)$ (10.64)

by (9.40), i.e.,

$\alpha_t^{(N)}(a) = e^{ith_N} a\, e^{-ith_N}$ ($a \in A_N$), (10.65)

which by Theorem 9.15 defines a time evolution on the quasi-local C*-algebra

$A = \overline{\bigcup_{N \in \mathbb{N}} A_N}^{\|\cdot\|} = \bigotimes_{x \in \mathbb{Z}} B(H_x)$, (10.66)

namely by regarding the unitaries $\exp(ith_N) \in A_N \subset A$ as elements of $A$ and putting

$\alpha_t(a) = \lim_{N\to\infty} e^{ith_N} a\, e^{-ith_N}$ ($a \in A$), (10.67)

which exists (although the sequence $(\exp(ith_N))_N$ in $A$ does not converge in $A$).

For any $B \in \mathbb{R}$, the quantum Ising chain has a $\mathbb{Z}_2$-symmetry given by a 180-
degree rotation around the x-axis, locally implemented by the unitary operator
$u(x) = \sigma_1(x)$, which at each $x \in \Lambda_N$ yields $(\sigma_1, \sigma_2, \sigma_3) \mapsto (\sigma_1, -\sigma_2, -\sigma_3)$, since
$\sigma_i \sigma_j \sigma_i^* = -\sigma_j$ for $i \neq j$. Thus $u(x)$ sends each $\sigma_3(x)$ to $-\sigma_3(x)$ but maps each
$\sigma_1(x)$ to itself. As in (10.57), this symmetry is implemented by the unitary operator

$u^{(N)} = \otimes_{x \in \Lambda_N} \sigma_1(x)$ (10.68)

on $H_N$, which satisfies $[h_N, u^{(N)}] = 0$, or, equivalently,

$u^{(N)} h_N (u^{(N)})^* = h_N$. (10.69)
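
For small $N$ the chain can be written out as an explicit matrix, which makes the symmetry (10.69) and the contrast between $B = 0$ and $B > 0$ easy to check. The sketch below assumes free boundary conditions, $J = 1$, and sites labelled $0, \ldots, n-1$ instead of $-\frac{1}{2}N, \ldots, \frac{1}{2}N - 1$; the helper names are illustrative and not taken from the text.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])    # Pauli sigma_1
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])   # Pauli sigma_3
I2 = np.eye(2)

def site_op(op, x, n):
    """`op` on site x, identity on the other n - 1 sites."""
    mats = [I2] * n
    mats[x] = op
    return reduce(np.kron, mats)

def ising_hamiltonian(n, B):
    """h_N of (10.60) on an open chain of n sites (bonds x, x+1 inside the chain)."""
    H = np.zeros((2**n, 2**n))
    for x in range(n - 1):
        H -= site_op(S3, x, n) @ site_op(S3, x + 1, n)
    for x in range(n):
        H -= B * site_op(S1, x, n)
    return H

def parity(n):
    """u^(N) of (10.68): the tensor product of sigma_1 over all sites."""
    return reduce(np.kron, [S1] * n)

def lowest_gap(n, B):
    """Difference between the two lowest eigenvalues of h_N."""
    E = np.linalg.eigvalsh(ising_hamiltonian(n, B))
    return float(E[1] - E[0])

if __name__ == "__main__":
    n = 6
    H, u = ising_hamiltonian(n, 0.5), parity(n)
    print("||[h_N, u]|| =", np.linalg.norm(H @ u - u @ H))
    for B in (0.0, 0.5):
        print(f"B = {B}: E1 - E0 =", lowest_gap(n, B))
```

At $B = 0$ the two lowest eigenvalues coincide, whereas for $B > 0$ a small but nonzero gap opens, which shrinks as the chain gets longer.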
The ensuing $\mathbb{Z}_2$-symmetry is given by the automorphism $\gamma^{(N)}$ of $A_N$ defined by

$\gamma^{(N)}(a) = u^{(N)} a (u^{(N)})^*$ ($a \in A_N$), (10.70)

which induces a global automorphism $\gamma \in \mathrm{Aut}(A)$ as in (10.59), i.e.,

$\gamma(a) = \lim_{N\to\infty} u^{(N)} a (u^{(N)})^*$ ($a \in A$), (10.71)

which limit once again exists despite the fact that the sequence $(u^{(N)})_N$ has no limit in
$A$. Thus $\mathbb{Z}_2$-invariance of the model follows from the local property

$\alpha_t^{(N)} \circ \gamma^{(N)} = \gamma^{(N)} \circ \alpha_t^{(N)}$, (10.72)

which in the limit $N \to \infty$ gives

$\alpha_t \circ \gamma = \gamma \circ \alpha_t$ ($t \in \mathbb{R}$). (10.73)

Since $\gamma^2 = \mathrm{id}_A$, we have an action of the group $\mathbb{Z}_2 = \{-1, 1\}$ on $A$, where the
nontrivial element (i.e., $g = -1$) is sent to $\gamma$. By (10.72) this group acts on the set
$S_\infty(A_N)$ of ground states of $A_N$ relative to the dynamics $\alpha^{(N)}$, and by (10.73) the
same is true for the set $S_\infty(A)$ of ground states of the corresponding infinite system
for $\alpha$ (and analogously for $\beta$-KMS states). These sets may be described as follows.

Theorem 10.11. 1. For any $N < \infty$ and $B = 0$ the ground state of the quantum Ising
model (10.60) is doubly degenerate and breaks the $\mathbb{Z}_2$ symmetry of the model.

2. For $N < \infty$ and any $B > 0$ the ground state $\omega^B_{1/N}$ is unique and hence $\mathbb{Z}_2$-invariant.

3. At $N = \infty$ with magnetic field $0 \leq B < 1$, the model has a doubly degenerate
translation-invariant ground state $\omega_B^\pm$, which again breaks the $\mathbb{Z}_2$ symmetry.

4. At $N = \infty$ and $B \geq 1$ the ground state is unique (and hence $\mathbb{Z}_2$-invariant).

5. Recall Definition 8.24. For $0 < B < 1$ the states $\omega^B_{1/N}$ (as in no. 2) with

$\omega^B_0 = \frac{1}{2}(\omega_B^+ + \omega_B^-)$ (10.74)

form a continuous field of states on the continuous bundle $A^{(q)}$; in particular,

$\lim_{N\to\infty} \omega^B_{1/N} = \omega^B_0$. (10.75)

The two ground states in no. 1 and no. 3 are tensor products of $|\uparrow\rangle$ and $|\downarrow\rangle$,
respectively (where $\sigma_3|\uparrow\rangle = |\uparrow\rangle$ and $\sigma_3|\downarrow\rangle = -|\downarrow\rangle$), so that $\sigma_3(0)$ is an order parameter
in the sense of Definition 10.6. In no. 4, on the other hand, each spin aligns with the
magnetic field in the x-direction, so that the ground state is an infinite tensor product
of states $|\rightarrow\rangle$, where $\sigma_1|\rightarrow\rangle = |\rightarrow\rangle$, and this time $\sigma_1(0)$ is an order parameter.
Case no. 2 becomes more transparent if we realize the Hilbert space $H_N$ as $\ell^2(S_N)$,
where $S_N$ is the set of all spin configurations $s$ on $N$ sites, that is,

$s: \{-\frac{1}{2}N, -\frac{1}{2}N + 1, \ldots, \frac{1}{2}N - 1\} \to \{-1, 1\}$.
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In terms of the eigenvectors 11) = 11) and | — 1) = | J,) of ff 3 , and the orthonormal 
basis {ds)seSN (where 5s{t) = 5st), a suitable unitary equivalence 

VN:fiSN)^HN (10.76) 


is given by linear extension of 

VN5s = \si-^^N)---sCjN-l)), sJGSn. (10.77) 

For example, the state |1 ••• 1) corresponds to where S|(x) = 1 for all x, and 
analogously S 4 ,(x) = —1 for the state | — 1 • • • — 1). Using £^{Sn), we may talk of 
localization of states in spin configuration space (similar to localization of wave- 
functions in L^(K")), in the sense that some \j/ € £^{Sn) may be peaked on just a 
few spins configurations. Provided 0 < B < 1 this is indeed the case for the unique 
ground state in case no. 2, which is similar to the ground state of the double-well 
potential discussed in §§10.1-10.2, replacing M by Sn (and It > 0 by 1 /N). 

Theorem 10.11 and related results used below, such as eq. (10.82), follow from
the exact solution of the model for both $N < \infty$ and $N = \infty$, to be discussed in
§§10.6-10.7. This solution is rather involved, but a rough picture of the various
ground states may already be obtained from a classical approximation in the spirit
of §8.1. This approximation assumes that the spin-1/2 operators $\tfrac{1}{2}\sigma_i$ are replaced
by their counterparts for spin $n \cdot \tfrac{1}{2}$, upon which one takes the limit $n \to \infty$. In this
limit, the spin operators are turned into the corresponding coordinate functions on
the coadjoint orbit $\mathcal{O}_{1/2} \subset \mathbb{R}^3$ for $SU(2)$, which is the two-sphere $S^2_{1/2}$ with radius
$r = 1/2$. In principle, this should be done for each of the $N$ spins separately, yielding
a classical Hamiltonian $h_c$ that is a function on the $N$-fold cartesian product of $S^2_{1/2}$
with itself. However, if we a priori assume translation invariance of the classical
ground state, only one such copy remains. Using spherical coordinates

$$(x = \tfrac{1}{2}\sin\theta\cos\varphi,\; y = \tfrac{1}{2}\sin\theta\sin\varphi,\; z = \tfrac{1}{2}\cos\theta), \tag{10.78}$$

the ensuing trial Hamiltonian becomes just a function on $S^2_{1/2}$, given by

$$h(\theta,\varphi) \approx -\left(\tfrac{1}{2}\cos^2\theta + B\sin\theta\cos\varphi\right). \tag{10.79}$$

Minimizing gives $\cos\varphi = 1$ and hence $y = 0$ for any $B$, upon which

$$h(\theta) \approx -\left(\tfrac{1}{2}\cos^2\theta + B\sin\theta\right) \tag{10.80}$$

yields the phase portrait of Theorem 10.11 for $N = \infty$, as follows. Since $h'(\theta) = \cos\theta\,(\sin\theta - B)$,
for $0 \le B < 1$ the global minimum is reached at the two different solutions $\theta_\pm$ of $\sin\theta_\pm = B$, with
ensuing spin vectors

$$x_\pm(B) = \left(\tfrac{1}{2}B,\, 0,\, \pm\tfrac{1}{2}\sqrt{1-B^2}\right), \tag{10.81}$$

starting at $x_\pm(0) = (0, 0, \pm\tfrac{1}{2})$ and merging at $B = 1$ to $x_+(1) = x_-(1) = (\tfrac{1}{2}, 0, 0)$.
This remains the unique ground state for $B > 1$, where all spins align with the field.
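The classical phase portrait just described can be checked numerically. The following is a minimal sketch, assuming NumPy and the trial function (10.80) exactly as written above; the function name `classical_minimum` and the grid size are illustrative choices, not part of the text.

```python
import numpy as np

def classical_minimum(B, n_grid=200001):
    """Grid-search minimiser of h(theta) = -(cos(theta)**2 / 2 + B*sin(theta)) on [0, pi]."""
    theta = np.linspace(0.0, np.pi, n_grid)
    h = -(0.5 * np.cos(theta) ** 2 + B * np.sin(theta))
    t = theta[np.argmin(h)]
    # Spin vector on the sphere of radius 1/2, with phi = 0 as in (10.78).
    return t, np.array([0.5 * np.sin(t), 0.0, 0.5 * np.cos(t)])

# One of the two symmetry-related minima for B < 1, and the single minimum for B > 1.
theta_broken, x_broken = classical_minimum(0.5)
theta_aligned, x_aligned = classical_minimum(1.5)
```

For $B < 1$ the search returns one of the two minimisers $\theta_\pm$ (which one is an artefact of the grid), whereas for $B > 1$ it returns $\theta = \pi/2$, i.e. the spin vector aligned with the field.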
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In the regime 0 < B < 1 with large but finite N, one finds a far-reaching analogy 
between the double-well potential and the quantum Ising chain, namely:

• The ground state of (10.60) is doubly peaked in spin configuration space, similar
to its counterpart for the double-well potential in real configuration space.

• One has convergence to localized ground states, i.e., (10.15) - (10.16) for the
double well and (10.74) - (10.75) for the quantum Ising chain.

• For the energy difference $\Delta_N = E_N^{(1)} - E_N^{(0)}$ between the first excited state and the
ground state one has (10.17) - (10.18) for the double well, and

$$\Delta_N \approx (1 - B^2)B^N \quad (N \to \infty), \tag{10.82}$$

for the quantum Ising chain. Thus both models show exponential decay, i.e. of
(10.82) in $N$ as $N \to \infty$, and of (10.13) in $1/\hbar$ as $\hbar \to 0$.
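The first and third points of this analogy can be illustrated by brute-force diagonalization of a short chain. This is a sketch only, assuming NumPy and the normalisation $-\sum_x \sigma_3(x)\sigma_3(x+1) - B\sum_x \sigma_1(x)$ with free boundary conditions (the overall prefactor of (10.60) is an assumption here and only rescales the gap); all function names are illustrative.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def site_op(op, x, n):
    """Embed a single-site operator at site x (0-based) of an n-site chain."""
    return reduce(np.kron, [op if y == x else I2 for y in range(n)])

def ising_hamiltonian(n, B):
    """-sum s3(x) s3(x+1) - B sum s1(x), free boundary conditions (assumed normalisation)."""
    h = np.zeros((2 ** n, 2 ** n))
    for x in range(n - 1):
        h -= site_op(S3, x, n) @ site_op(S3, x + 1, n)
    for x in range(n):
        h -= B * site_op(S1, x, n)
    return h

def ground_state_distribution(n, B):
    """Spectrum and the ground-state probabilities over spin configurations (sigma_3 basis)."""
    energies, vectors = np.linalg.eigh(ising_hamiltonian(n, B))
    return energies, vectors[:, 0] ** 2

# Index 0 is the all-up configuration, index 2**n - 1 the all-down one.
energies, probs = ground_state_distribution(8, 0.5)
```

In this basis the two largest entries of `probs` sit at the all-up and all-down configurations, and `energies[1] - energies[0]` is already small at $N = 8$.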

It should be mentioned that exponential decay of the energy gap seems a low-dimensional
luxury, which is not really needed for SSB. All that counts is that
$\lim_{N\to\infty} \Delta_N = 0$, which guarantees that the first excited state is asymptotically
degenerate with the ground state, so that appropriate linear combinations like $\omega^{\pm}$ can be
formed that converge to the degenerate symmetry-breaking pure (and hence physical)
ground states (or extreme and hence physical KMS states) of the limit system,
which are localized and stable (as is clear from the double well). The fact that in
the two models at hand only one excited state participates in this mechanism is due
to the simple $\mathbb{Z}_2$ symmetry that is being broken; SSB of continuous symmetries requires
a large number of low-lying states that are asymptotically degenerate with the
ground state and hence also with each other (one speaks of a thin energy spectrum).

The existence of low-lying excited states may be proved abstractly (i.e., in a 
model-independent way), as follows. For N <°°, let be the ground state (as¬ 
sumed unique) of some model defined on A^ C and let (j) be an order parameter 
(cf. Theorem 10.7) with accompanying vector field 0^ = ir* the quan¬ 

tum Ising chain, we take (j) — ai. Then the key assumptions are expressed by 

=0; (10.83) 

{0nVn\^nWn^) > Cl-N^ {N ^ Cl > 0); (10.84) 

||[[0;v,%],^;v]|| <C 2 -N {N^oo,C 2 >Q). (10.85) 

The first states that the ground state is symmetric, the second enforces long-range 
order, as in (10.41), and the third follows from having short-range forces. A simple 
computation then shows that the unit vector satisfies 

( 10 . 86 ) 

Since is orthogonal to by (10.83), the variational principle for eigenvalues 
(note that has discrete spectrum, as dim(//yv^) < 00 ) then gives An < C 2 /{CiN), 
so that An vanishes as N ^ though perhaps not as quickly as (10.82) indicates. 
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10.6 Exact solution of the quantum Ising chain: N <o° 

The solution of the quantum Ising chain is based on a transformation to fermionic
variables. Let $H$ be a Hilbert space and let $F_-(H)$ be its fermionic Fock space, i.e.,

$$F_-(H) = \bigoplus_{k=0}^{\infty} H_-^k, \tag{10.87}$$

where $H_-^0 = \mathbb{C}$, and for $k > 0$ the Hilbert space $H_-^k = e_-^{(k)} H^k$ is the totally
antisymmetrized $k$-fold tensor product of $H$ with itself, see also §7.7. Here the projection
$e_-^{(k)} : H^k \to H^k$ is defined by linear extension of

$$e_-^{(k)} f_1 \otimes \cdots \otimes f_k = \frac{1}{k!} \sum_{p \in \mathfrak{S}_k} \mathrm{sgn}(p)\, f_{p(1)} \otimes \cdots \otimes f_{p(k)}, \tag{10.88}$$

where $\mathfrak{S}_k$ is the permutation group on $k$ objects, and $\mathrm{sgn}(p)$ is $+1$ / $-1$ if $p$ is
an even/odd permutation. With the (total) Fock space $F(H) = \bigoplus_{k=0}^{\infty} H^k$, we have
$F_-(H) = e_- F(H)$, where $e_- = \sum_k e_-^{(k)}$ (strongly) is a projection. For $f \in H$ we define
the (unbounded) annihilation operator $a(f)$ on $F(H)$ by (finite) linear extension of

$$a(f) f_1 \otimes \cdots \otimes f_k = \sqrt{k}\, \langle f, f_1\rangle\, f_2 \otimes \cdots \otimes f_k, \tag{10.89}$$

for $k > 0$, with $a(f)z = 0$ on $H^0 = \mathbb{C}$. This gives the adjoint $a(f)^* = a^*(f)$ as

$$a^*(f) f_1 \otimes \cdots \otimes f_k = \sqrt{k+1}\, f \otimes f_1 \otimes \cdots \otimes f_k. \tag{10.90}$$

For each $f \in H$, we then define the following operators on $F_-(H)$:

$$c(f) = e_-\, a(f)\, e_-; \tag{10.91}$$

$$c^*(f) = e_-\, a^*(f)\, e_-. \tag{10.92}$$

Note that the map $f \mapsto c(f)$ is antilinear in $f$, whereas $f \mapsto c^*(f)$ is linear in $f$. It
follows that $c^*(f) = c(f)^*$, that each operator $c(f)$ and $c^*(f)$ on $F_-(H)$ is bounded
with $\|c(f)\| = \|c^*(f)\| = \|f\|$, and that the canonical anticommutation relations hold:

$$[c(f), c^*(g)]_+ = \langle f, g\rangle_H \cdot 1_{F_-(H)}; \tag{10.93}$$

$$[c(f), c(g)]_+ = [c^*(f), c^*(g)]_+ = 0. \tag{10.94}$$

Thus we may define $\mathrm{CAR}(H)$ as the C*-algebra within $B(F_-(H))$ generated by all
$c(f)$, where $f \in H$. This is called the C*-algebra of canonical anticommutation
relations over $H$, which we have constructed in its defining representation on $F_-(H)$.
Choosing an orthonormal basis $(e_i)$ of $H$ and writing $c(e_i) = c_i$ etc. clearly yields

$$[c_i, c_j^*]_+ = \delta_{ij} \cdot 1_{F_-(H)}; \tag{10.95}$$

$$[c_i, c_j]_+ = [c_i^*, c_j^*]_+ = 0. \tag{10.96}$$

If $\dim(H) = N < \infty$, then $\mathrm{CAR}(H) = B(F_-(H))$. First, a dimension count yields

$$F_-(\mathbb{C}^N) = \bigoplus_{k=0}^{N} H_-^k \cong \mathbb{C}^{2^N}. \tag{10.97}$$

By Theorem C.90, the C*-algebra $\mathrm{CAR}(H)$ acts irreducibly on $F_-(H)$, so that

$$\mathrm{CAR}(\mathbb{C}^N) \cong M_{2^N}(\mathbb{C}). \tag{10.98}$$

This is already nontrivial for $N = 1$. In that case, $F_-(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C} = \mathbb{C}^2$, and

$$c = \sigma_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}; \tag{10.99}$$

$$c^* = \sigma_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \tag{10.100}$$

where $\sigma_\pm = \tfrac{1}{2}(\sigma_1 \pm i\sigma_2)$. This realization explicitly shows that

$$\mathrm{CAR}(\mathbb{C}) \cong M_2(\mathbb{C}). \tag{10.101}$$
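The dimension count (10.97) can be made concrete by building the antisymmetrizing projections (10.88) as matrices and summing their ranks. The sketch below assumes NumPy and small $n$, $k$ (the matrices have size $n^k$); the helper names are illustrative and not taken from the text.

```python
import itertools
import math
import numpy as np

def permutation_sign(p):
    """Sign of a permutation given as a tuple, via counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def antisymmetrizer(n, k):
    """Matrix of the projection (10.88) on the k-fold tensor power of C^n."""
    dim = n ** k
    identity = np.eye(dim).reshape([n] * (2 * k))
    e = np.zeros((dim, dim))
    for p in itertools.permutations(range(k)):
        # Permuting the first k tensor legs of the identity gives the permutation operator.
        axes = list(p) + list(range(k, 2 * k))
        e += permutation_sign(p) * identity.transpose(axes).reshape(dim, dim)
    return e / math.factorial(k)

def fock_dimension(n):
    """Dimension of the fermionic Fock space over C^n (k = 0 contributes 1)."""
    return 1 + sum(int(round(np.trace(antisymmetrizer(n, k)))) for k in range(1, n + 1))

e_2 = antisymmetrizer(3, 2)
```

The trace of each projection is its rank, i.e. the binomial coefficient $\binom{n}{k}$, so the ranks add up to $2^n$.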


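To generalize this to $N > 1$, we introduce a lattice (or chain) $\Lambda = \{1, \ldots, N\}$, and
for each $x \in \Lambda$ we define operators $c_x$, $c_x^*$ by the Jordan-Wigner transformation

$$c_x = e^{\pi i \sum_{y=1}^{x-1} \sigma_+(y)\sigma_-(y)}\, \sigma_-(x) = \Big(\prod_{y=1}^{x-1} (-\sigma_3)(y)\Big) \cdot \sigma_-(x); \tag{10.102}$$

$$c_x^* = e^{-\pi i \sum_{y=1}^{x-1} \sigma_+(y)\sigma_-(y)}\, \sigma_+(x) = \Big(\prod_{y=1}^{x-1} (-\sigma_3)(y)\Big) \cdot \sigma_+(x), \tag{10.103}$$

where $x > 1$, and $c_1 = \sigma_-(1)$ and $c_1^* = \sigma_+(1)$ (here $\sigma_\pm(x) = \tfrac{1}{2}(\sigma_1(x) \pm i\sigma_2(x))$ etc.).
These operators satisfy (10.95) - (10.96); the second expression on each line follows
because the operators $\sigma_+(y)\sigma_-(y)$ commute for different sites $y$, and

$$e^{\pm\pi i \sigma_+\sigma_-} = -\sigma_3. \tag{10.104}$$

Furthermore, since

$$c_x^* c_x = \sigma_+(x)\sigma_-(x) = \tfrac{1}{2}(1_2 + \sigma_3)(x); \tag{10.105}$$

$$c_x c_x^* = \sigma_-(x)\sigma_+(x) = \tfrac{1}{2}(1_2 - \sigma_3)(x), \tag{10.106}$$

the inverse of the Jordan-Wigner transformation is given by

$$\sigma_-(x) = e^{-\pi i \sum_{y=1}^{x-1} c_y^* c_y}\, c_x; \tag{10.107}$$

$$\sigma_+(x) = e^{\pi i \sum_{y=1}^{x-1} c_y^* c_y}\, c_x^*. \tag{10.108}$$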
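That the Jordan-Wigner operators satisfy the CAR, while the bare $\sigma_\pm(x)$ do not, can be verified directly for a short chain. The following sketch assumes NumPy and the one-site realization (10.99) - (10.100); function names are illustrative.

```python
import numpy as np
from functools import reduce

SIGMA_MINUS = np.array([[0.0, 0.0], [1.0, 0.0]])   # c for N = 1, eq. (10.99)
SIGMA_3 = np.array([[1.0, 0.0], [0.0, -1.0]])
ID2 = np.eye(2)

def jordan_wigner(n):
    """Return [c_1, ..., c_n] as 2^n x 2^n matrices: a string of (-sigma_3) followed by sigma_-."""
    ops = []
    for x in range(n):
        factors = [-SIGMA_3] * x + [SIGMA_MINUS] + [ID2] * (n - x - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticommutator(a, b):
    return a @ b + b @ a

def satisfies_car(ops):
    """Check (10.95)-(10.96); the matrices are real, so the adjoint is the transpose."""
    dim = ops[0].shape[0]
    for i, ci in enumerate(ops):
        for j, cj in enumerate(ops):
            expected = np.eye(dim) if i == j else np.zeros((dim, dim))
            if not np.allclose(anticommutator(ci, cj.T), expected):
                return False
            if not np.allclose(anticommutator(ci, cj), 0.0):
                return False
    return True

c_ops = jordan_wigner(3)
```

Dropping the string factors leaves operators that commute (rather than anticommute) at different sites, which is exactly what the transformation repairs.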




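We return to the quantum Ising model (10.60) with free boundary conditions,
where we relabel the sites as $\{1, \ldots, N\}$, as above, and change to the Hamiltonian

$$h_N^{\mathrm{QI}} = -\frac{1}{2}\left(\sum_{x=1}^{N-1} \sigma_1(x)\sigma_1(x+1) + \lambda \sum_{x=1}^{N} \sigma_3(x)\right), \tag{10.109}$$

where, in order to avoid notational confusion with the operator $B$ in (10.111) below,
we henceforth replace $B$ by $\lambda$. In terms of the unitary operator $u = \sqrt{1/2}\,(1_2 + i\sigma_2)$
on $\mathbb{C}^2$ and hence on $\bigotimes^N \mathbb{C}^2$, conjugation by $u$ relates (10.60) to (10.109).

Using (10.102) - (10.103), up to an additive constant $\tfrac{1}{2}\lambda N \cdot 1$ which we omit, we find

$$h_N^{\mathrm{QI}} = -\sum_{x=1}^{N} \left(\lambda\, c_x^* c_x + \tfrac{1}{2}(c_x^* - c_x)(c_{x+1}^* + c_{x+1})\right) \tag{10.110}$$

(with the convention $c_{N+1} = 0$), so we now show how to diagonalize quadratic fermionic Hamiltonians of the type

$$h_N = -\sum_{x,y=1}^{N} \left(A_{xy}\, c_x^* c_y + \tfrac{1}{2} B_{xy}\,(c_x^* c_y^* - c_x c_y)\right), \tag{10.111}$$

where $A$ and $B$ are real $N \times N$ matrices, with $A^* = A$ and $B^* = -B$. Indeed, taking

$$A = \tfrac{1}{2}(S + S^*) + \lambda \cdot 1_N; \tag{10.112}$$

$$B = \tfrac{1}{2}(S - S^*), \tag{10.113}$$

recovers (10.110), where $S : \mathbb{C}^N \to \mathbb{C}^N$ is the shift operator, defined by

$$Sf(x) = f(x+1); \tag{10.114}$$

$$S^*f(x) = f(x-1). \tag{10.115}$$

By convention, $f(N+1) = f(0) = 0$ (i.e., $Sf(N) = S^*f(1) = 0$ for any $f \in \mathbb{C}^N$);
in terms of the standard basis $(\upsilon_x)$ of $\mathbb{C}^N$ we have $S\upsilon_1 = 0$ and $S\upsilon_x = \upsilon_{x-1}$ for
$x = 2, \ldots, N$, and likewise $S^*\upsilon_N = 0$ and $S^*\upsilon_x = \upsilon_{x+1}$ for $x = 1, \ldots, N-1$.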

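The smart thing to do now turns out to be diagonalizing the $2N \times 2N$-matrix

$$M = \begin{pmatrix} A & B \\ -B & -A \end{pmatrix}, \tag{10.116}$$

which by a unitary transformation may be brought into the simpler form

$$M' = \begin{pmatrix} \sqrt{1/2} & -\sqrt{1/2} \\ \sqrt{1/2} & \sqrt{1/2} \end{pmatrix} \begin{pmatrix} A & B \\ -B & -A \end{pmatrix} \begin{pmatrix} \sqrt{1/2} & \sqrt{1/2} \\ -\sqrt{1/2} & \sqrt{1/2} \end{pmatrix} = \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}, \tag{10.117}$$

where $C = A + B$. For example, for the model (10.110) we simply have

$$C = S + \lambda \cdot 1_N. \tag{10.118}$$

The equations for the eigenvalues $\varepsilon_k$ and eigenvectors of $M'$, i.e.,

$$M' \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix} = \varepsilon_k \begin{pmatrix} \varphi_k \\ \psi_k \end{pmatrix}, \tag{10.119}$$

where $\varphi_k, \psi_k \in \mathbb{C}^N$, are equivalent to both the coupled system of equations

$$C\psi_k = \varepsilon_k \varphi_k; \tag{10.120}$$

$$C^*\varphi_k = \varepsilon_k \psi_k; \tag{10.121}$$

$$C = A + B, \tag{10.122}$$

where the eigenvalues $\varepsilon_k$ are real (since $M^* = M$), and to the uncoupled version

$$CC^*\varphi_k = \varepsilon_k^2 \varphi_k; \tag{10.123}$$

$$C^*C\psi_k = \varepsilon_k^2 \psi_k; \tag{10.124}$$

$$CC^* = A^2 - B^2 - [A,B]; \tag{10.125}$$

$$C^*C = A^2 - B^2 + [A,B]. \tag{10.126}$$

Without loss of generality we may (and will) assume that the $\varphi_k$, $\psi_k$ are unit vectors
in $\mathbb{C}^N$, so that the corresponding unit vector in $\mathbb{C}^{2N}$ is $(\varphi_k, \psi_k)/\sqrt{2}$. Furthermore,
since $C$ (or $M$) is a matrix with real entries and the $\varepsilon_k$ are real, by a suitable choice
of phase we may (and will) also arrange that $\varphi_k$, $\psi_k$ have real components. Finally,
it follows from (10.120) - (10.121) that $(-\varphi_k, \psi_k)$ is an eigenvector of $M'$ with eigenvalue
$-\varepsilon_k$, so that the unitary transformation $U'$ that diagonalizes $M'$, i.e.,

$$(U')^{-1} M' U' = \begin{pmatrix} E & 0 \\ 0 & -E \end{pmatrix}, \tag{10.127}$$

where $E = \mathrm{diag}(\varepsilon_1, \ldots, \varepsilon_N)$, takes the form

$$U' = \frac{1}{\sqrt{2}} \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix}, \tag{10.128}$$

where $\varphi$ is the $N \times N$ matrix $(\varphi_1, \ldots, \varphi_N)$, seeing each vector $\varphi_i$ as a column, etc.
Combined with (10.117), we obtain

$$U = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \psi + \varphi & \psi - \varphi \\ \psi - \varphi & \psi + \varphi \end{pmatrix} = \begin{pmatrix} u & v \\ v & u \end{pmatrix}; \tag{10.129}$$

$$U^{-1} M U = \begin{pmatrix} E & 0 \\ 0 & -E \end{pmatrix}, \tag{10.130}$$

where we introduced $N \times N$ matrices

$$u = \tfrac{1}{2}(\psi + \varphi); \tag{10.131}$$

$$v = \tfrac{1}{2}(\psi - \varphi). \tag{10.132}$$

Using orthonormality and completeness of both the $(\varphi_k)$ and the $(\psi_k)$, one obtains

$$u^*u + v^*v = 1_N; \tag{10.133}$$

$$u^*v + v^*u = 0; \tag{10.134}$$

$$uu^* + vv^* = 1_N; \tag{10.135}$$

$$uv^* + vu^* = 0. \tag{10.136}$$

Of course, $u$ and $v$ are far from unique, as they depend on both the ordering and
the phases of the vectors $\varphi_k$ and $\psi_k$. In partial remedy of the former ambiguity we
assume that $0 \le \varepsilon_1 \le \varepsilon_2 \le \cdots \le \varepsilon_N$ (which can be arranged by a suitable ordering as
well as choice of sign of the eigenvectors $\varphi_k$). Towards the latter, we already agreed
that both the $\varphi_k$ and $\psi_k$ are real, so that also our matrices $u$ and $v$ have real entries.
We now explain the purpose of diagonalizing $M$ in (10.116) using $u$ and $v$.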

Proposition 10.12. Let $u$ and $v$ be operators on a Hilbert space $H$, where $u$ is linear
and $v$ is anti-linear. Let $c(f)$ and $c^*(f)$ be the operators (10.91) - (10.92), satisfying
the CAR (10.93) - (10.94). Define the Bogoliubov transformation

$$\eta(f) = c(uf) + c^*(vf); \tag{10.137}$$

$$\eta^*(f) = c^*(uf) + c(vf), \tag{10.138}$$

which extends to a linear map $\alpha : \mathrm{CAR}(H) \to \mathrm{CAR}(H)$, where $\eta(f) = \alpha(c(f))$ etc.
Then $\alpha$ is a homomorphism of C*-algebras, or, equivalently, one has the CAR

$$[\eta(f), \eta^*(g)]_+ = \langle f, g\rangle_H \cdot 1; \tag{10.139}$$

$$[\eta(f), \eta(g)]_+ = [\eta^*(f), \eta^*(g)]_+ = 0, \tag{10.140}$$

iff $u$ and $v$ satisfy (10.133) - (10.134), with the matrices there replaced by the operators
$u$ and $v$ at hand. Moreover, $\alpha$ is invertible (and hence defines an automorphism of $\mathrm{CAR}(H)$)
iff in addition (10.135) - (10.136) are valid (again for the operators $u$ and $v$), in which
case the inverse is

$$c(f) = \eta(u^*f) + \eta^*(v^*f); \tag{10.141}$$

$$c^*(f) = \eta^*(u^*f) + \eta(v^*f). \tag{10.142}$$

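Note that anti-linearity of $v$ is needed to make $f \mapsto \eta(f)$ anti-linear, like $f \mapsto c(f)$.
With respect to a base $(e_i)$ of $H$, and writing $u_{ij} = \langle e_i, u e_j\rangle$ and $v_{ij} = \langle e_i, v e_j\rangle$,
the transformations (10.137) - (10.142) read

$$\eta_i = \sum_j \left(\overline{u_{ji}}\, c_j + v_{ji}\, c_j^*\right); \tag{10.143}$$

$$\eta_i^* = \sum_j \left(u_{ji}\, c_j^* + \overline{v_{ji}}\, c_j\right); \tag{10.144}$$

$$c_i = \sum_j \left(u_{ij}\, \eta_j + v_{ij}\, \eta_j^*\right); \tag{10.145}$$

$$c_i^* = \sum_j \left(\overline{u_{ij}}\, \eta_j^* + \overline{v_{ij}}\, \eta_j\right). \tag{10.146}$$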
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Proof. The proof is a straightforward computation. □ 

In comparison with the preceding diagonalization process, where $H = \mathbb{C}^N$, we
notice that in this process $u$ and $v$ were both linear, whereas in Proposition 10.12 $u$
is linear whereas $v$ is antilinear. This difference is easily overcome by taking the
operator $u$ of Proposition 10.12 to be the matrix $u$ of (10.131) and the operator $v$ to be $Jv$,
with $v$ the matrix (10.132), where $J : \mathbb{C}^N \to \mathbb{C}^N$ is the anti-linear map $Jf(x) = \overline{f(x)}$,
so that $J$ is a conjugation in being an anti-linear map that satisfies $J^* = J^{-1} = J$.

Returning to our generic Hamiltonian (10.111), a straightforward computation
using (10.145) - (10.146), (10.116), (10.129), and (10.133) - (10.136) yields

$$h_N = \sum_{k=1}^{N} \varepsilon_k\, \eta_k^* \eta_k, \tag{10.147}$$

up to a (computable) constant, where we recall that $\varepsilon_k \ge 0$ ($k = 1, \ldots, N$). Note that
$h_N$ is still defined on the fermionic Fock space $F_-(\mathbb{C}^N)$, as $\eta_k^*\eta_k$ is a (complicated)
quadratic expression in the operators $c_i$ and $c_i^*$ on $F_-(\mathbb{C}^N)$. The point is that (as a
consequence of Proposition 10.12) the $\eta_i$ and $\eta_i^*$ also satisfy the CAR, i.e.,

$$[\eta_i, \eta_j^*]_+ = \delta_{ij} \cdot 1_{F_-(H)}; \tag{10.148}$$

$$[\eta_i, \eta_j]_+ = [\eta_i^*, \eta_j^*]_+ = 0. \tag{10.149}$$
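The diagonalization procedure leading to (10.147) can be traced numerically. In the sketch below (assuming NumPy), the vectors $\varphi_k$, $\psi_k$ solving (10.120) - (10.121) are obtained from a singular value decomposition of $C = A + B$; using the SVD for this purpose is a choice made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def ising_blocks(n, lam):
    """Real matrices A, B of (10.112)-(10.113); the shift S has entries S[x, x+1] = 1."""
    S = np.diag(np.ones(n - 1), k=1)
    A = 0.5 * (S + S.T) + lam * np.eye(n)
    B = 0.5 * (S - S.T)
    return A, B

def bogoliubov_matrices(A, B):
    """Return (eps, u, v) with C psi_k = eps_k phi_k and C^T phi_k = eps_k psi_k."""
    C = A + B
    phi, eps, psi_t = np.linalg.svd(C)        # C = phi @ diag(eps) @ psi_t
    psi = psi_t.T
    order = np.argsort(eps)                   # ascending energies, as assumed in the text
    eps, phi, psi = eps[order], phi[:, order], psi[:, order]
    return eps, 0.5 * (psi + phi), 0.5 * (psi - phi)

A, B = ising_blocks(6, 0.5)
eps, u, v = bogoliubov_matrices(A, B)
M = np.block([[A, B], [-B, -A]])              # eq. (10.116)
U = np.block([[u, v], [v, u]])                # eq. (10.129)
```

With these real matrices, `U.T @ M @ U` is diagonal with entries $\pm\varepsilon_k$, and `u`, `v` satisfy the four relations (10.133) - (10.136) with adjoints given by transposes.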

Theorem 10.13. Let $A = \mathrm{CAR}(\mathbb{C}^N)$ be the CAR-algebra over $H = \mathbb{C}^N$, with dynamics
$\alpha_t(a) = e^{ith_N} a\, e^{-ith_N}$ given by (10.111) and hence by (10.147). Then $\alpha$ has a
unique (and hence pure and symmetric) ground state $\omega_0$, specified by the property

$$\pi_{\omega_0}(\eta(f))\,\Omega_{\omega_0} = 0 \quad (f \in \mathbb{C}^N). \tag{10.150}$$

Proof. Recall that $\alpha$ defines a derivation $\delta : \mathrm{CAR}(\mathbb{C}^N) \to \mathrm{CAR}(\mathbb{C}^N)$ defined by
(9.54), which in the case at hand is simply given by $\delta(a) = i[h_N, a]$ (since $A$ is
finite-dimensional, $\delta$ is bounded and hence defined everywhere). Using the identity

$$[ab, c] = a[b,c]_+ - [c,a]_+ b, \tag{10.151}$$

as well as the relations (10.148) - (10.149), we obtain $\delta(\eta_k) = -i\varepsilon_k \eta_k$, and hence

$$-i\omega_0(\eta_k^* \delta(\eta_k)) = -\varepsilon_k\, \omega_0(\eta_k^* \eta_k). \tag{10.152}$$

The condition $-i\omega_0(a^*\delta(a)) \ge 0$, i.e., eq. (9.56) from Proposition 9.20, therefore
implies that $\omega_0(\eta_k^*\eta_k) \le 0$, and hence $\omega_0(\eta_k^*\eta_k) = 0$ by positivity of $\omega_0$. Since
$F_-(H)$ is finite-dimensional and $A = B(F_-(H))$, cf. (10.98), we may assume ground
state(s) to be pure and normal, i.e., there is some unit vector $\psi_0 \in F_-(H)$ with
$\omega_0(a) = \langle\psi_0, a\psi_0\rangle$ for each $a \in A$. Hence $\langle\psi_0, \eta_k^*\eta_k\psi_0\rangle = 0$, which enforces

$$\eta_k\psi_0 = 0 \quad (k = 1, \ldots, N). \tag{10.153}$$

This property makes $\psi_0$ unique up to a phase. Indeed, together with (10.148) -
(10.149), eq. (10.153) implies the values of all one- and two-point functions, i.e.,

$$\omega_0(\eta(f)) = \omega_0(\eta^*(f)) = 0; \tag{10.154}$$

$$\omega_0(\eta^*(f)\eta(g)) = \omega_0(\eta^*(f)\eta^*(g)) = \omega_0(\eta(f)\eta(g)) = 0; \tag{10.155}$$

$$\omega_0(\eta(f)\eta^*(g)) = \langle f, g\rangle_H. \tag{10.156}$$

Furthermore, the value of $\omega_0$ on any product of an odd number of $\eta(f)$ and $\eta^*(g)$
vanishes; for an even number the value $\omega_0\big(\prod_{i=1}^{n} \eta(f_i) \prod_{j=1}^{n} \eta^*(g_j)\big)$ is given by

$$\sum_{p=1}^{n} (-1)^{n-p}\, \omega_0\big(\eta(f_1)\eta^*(g_p)\big)\, \omega_0\Big(\prod_{i=2}^{n} \eta(f_i) \prod_{j=1,\, j\neq p}^{n} \eta^*(g_j)\Big).$$

Hence (10.153) gives $\omega_0$ on all of $\mathrm{CAR}(\mathbb{C}^N)$. Since $\mathrm{CAR}(H) = B(F_-(H))$, this fixes
$\psi_0$ up to a phase. Eq. (10.150) is just a fancy way of rewriting (10.153). □

By construction, the ground state energy of (10.147) is zero. In connection with
our approach to SSB via Butterfield’s Principle it is of interest to compute the energy
$\varepsilon_1$ of the first excited state. This may be done from (10.120) - (10.121) with (10.122)
and the specific expression (10.118) for the quantum Ising chain. Thus we solve

$$\lambda\psi_k(x) + \psi_k(x+1) = \varepsilon_k\varphi_k(x) \quad (x = 1, \ldots, N,\; \psi_k(N+1) = 0); \tag{10.157}$$

$$\lambda\varphi_k(x) + \varphi_k(x-1) = \varepsilon_k\psi_k(x) \quad (x = 1, \ldots, N,\; \varphi_k(0) = 0). \tag{10.158}$$

A solution of this system (with real wave-functions and positive energy) is given by

$$\psi_k(x) = C(-1)^k \sin(q_k(x - N - 1)); \tag{10.159}$$

$$\varphi_k(x) = -C\sin(q_k x); \tag{10.160}$$

$$\varepsilon_k = \sqrt{1 + \lambda^2 + 2\lambda\cos(q_k)}, \tag{10.161}$$

where $C > 0$ is a normalization constant (note that this choice respects the boundary
conditions $\psi_k(N+1) = 0$ and $\varphi_k(0) = 0$), and $q_k$ should be solved from

$$(N+1)q_k = (k-1)\pi + \arctan\left(\frac{\sin q_k}{\lambda + \cos q_k}\right). \tag{10.162}$$

For example, for $\lambda = 0$ (i.e. no transverse magnetic field) we obtain $q_k = k\pi/N$.
For $\lambda > 1$ there is a unique real solution $q_k$ for each $k$, too, and
even as $N \to \infty$ there is an energy gap $\varepsilon_k > 0$ for each $k$. For $0 < \lambda < 1$, however,
there is a complex solution $q_1 = \pi + i\rho$, where $\rho \in \mathbb{R}$ is a solution to

$$\tanh((N+1)\rho) = \frac{\sinh\rho}{\cosh\rho - \lambda}. \tag{10.163}$$

As $N \to \infty$, we find $\rho = -\ln(\lambda) - (1 - \lambda^2)\lambda^{2N}$. Eq. (10.161) then gives

$$\varepsilon_1 \approx (1 - \lambda^2)\lambda^N \quad (N \to \infty), \tag{10.164}$$

which, recalling that $E_N^{(1)} = \varepsilon_1$ and $E_N^{(0)} = 0$ and hence $\Delta_N = \varepsilon_1$, confirms (10.82).
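The asymptotics (10.164) can be compared with two independent computations for small $N$: the lowest $\varepsilon_k$, obtained as the smallest singular value of $C = S + \lambda \cdot 1_N$ (cf. (10.123) - (10.124)), and the gap of the spin Hamiltonian with the normalisation $-\tfrac{1}{2}(\sum \sigma_1\sigma_1 + \lambda\sum\sigma_3)$ assumed for (10.109). This is a sketch assuming NumPy; the function names are illustrative.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.array([[1.0, 0.0], [0.0, -1.0]])
ID2 = np.eye(2)

def site_op(op, x, n):
    """Embed a single-site operator at site x (0-based) of an n-site chain."""
    return reduce(np.kron, [op if y == x else ID2 for y in range(n)])

def spin_gap(n, lam):
    """Gap of -(1/2) (sum s1(x) s1(x+1) + lam sum s3(x)) by brute-force diagonalisation."""
    h = np.zeros((2 ** n, 2 ** n))
    for x in range(n - 1):
        h -= 0.5 * site_op(S1, x, n) @ site_op(S1, x + 1, n)
    for x in range(n):
        h -= 0.5 * lam * site_op(S3, x, n)
    energies = np.linalg.eigvalsh(h)
    return energies[1] - energies[0]

def fermionic_gap(n, lam):
    """Smallest singular value of C = S + lam * 1, i.e. the lowest eps_k."""
    C = np.diag(np.ones(n - 1), k=1) + lam * np.eye(n)
    return np.linalg.svd(C, compute_uv=False).min()

def asymptotic_gap(n, lam):
    """Right-hand side of (10.164)."""
    return (1.0 - lam ** 2) * lam ** n
```

For $\lambda < 1$ the first two functions agree to numerical precision and approach `asymptotic_gap` quickly as $N$ grows, whereas for $\lambda > 1$ the smallest singular value stays bounded away from zero.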
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10.7 Exact solution of the quantum Ising chain: N = o° 

The (two-sided) infinite quantum Ising chain is described by the C*-algebra

$$F = \mathrm{CAR}(\ell^2(\mathbb{Z})); \tag{10.165}$$

one may also consider a one-sided chain, but it lacks translation symmetry. By the
construction at the beginning of the previous section, $F$ is isomorphic to the infinite
tensor product $A = M_2(\mathbb{C})^{\infty}$. We consider $F$ to be generated by the operators $c_x^{\pm}$
($x \in \mathbb{Z}$), where $c_x^- = c_x$ and $c_x^+ = c_x^*$. In this notation, the CAR (10.95) - (10.96) read

$$[c_x^-, c_y^+]_+ = \delta_{xy}; \tag{10.166}$$

$$[c_x^{\pm}, c_y^{\pm}]_+ = 0. \tag{10.167}$$

Although the local Hamiltonians (10.111) do not have a limit as $N \to \infty$, as explained
in §10.5 they do generate a time-evolution on $F$ in the sense of a continuous
homomorphism $\alpha : \mathbb{R} \to \mathrm{Aut}(F)$ via (10.65) and (10.67); see also Theorem 9.15.
Let us first extend the approach in the previous section to $N = \infty$, in which case
$\mathbb{C}^N$ is replaced by $H = \ell^2(\mathbb{Z})$, assuming the theory has already been brought into
fermionic form with local Hamiltonians (10.111) (as we will see, it is this step,
i.e., the Jordan-Wigner transformation, that marks the difference between $N < \infty$
and $N = \infty$). Thus we define operators $A : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ and $B : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$
as the obvious extensions of the $N \times N$ matrices $A$ and $B$ to operators on $\ell^2(\mathbb{Z})$,
and similarly $S : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ is the “full” shift operator, defined by $(Sf)(x) = f(x+1)$.
Instead of the somewhat clumsy explicit solution procedure sketched in the
previous section for $N < \infty$, we may now simply rely on the Fourier transformation

$$\mathcal{F} : \ell^2(\mathbb{Z}) \to L^2([-\pi, \pi]); \tag{10.168}$$

$$(\mathcal{F}f)(k) \equiv \hat{f}(k) = \sum_{x \in \mathbb{Z}} e^{-ikx} f(x); \tag{10.169}$$

$$(\mathcal{F}^{-1}\hat{f})(x) \equiv f(x) = \int_{-\pi}^{\pi} \frac{dk}{2\pi}\, e^{ikx} \hat{f}(k), \tag{10.170}$$

which diagonalizes $A$ and $B$ to operators $\hat{A}, \hat{B} : L^2([-\pi, \pi]) \to L^2([-\pi, \pi])$. For the
quantum Ising Hamiltonian (10.110) these are given by the multiplication operators

$$\hat{A}\hat{\psi}(k) = -(\cos k + \lambda)\hat{\psi}(k); \tag{10.171}$$

$$\hat{B}\hat{\psi}(k) = -i\sin k\, \hat{\psi}(k). \tag{10.172}$$

For fixed $k$, the eigenvalues of the $2 \times 2$ matrix

$$\hat{M}_k = \begin{pmatrix} -(\cos k + \lambda) & -i\sin k \\ i\sin k & \cos k + \lambda \end{pmatrix} \tag{10.173}$$

are $\pm\varepsilon_k$, given by (10.161) with $q_k$ replaced by $k$. It is then routine to find a unitary $2 \times 2$
matrix $U_k = \begin{pmatrix} u_k & v_k \\ v_k & u_k \end{pmatrix}$ that diagonalizes $\hat{M}_k$ in the sense that $U_k^{-1}\hat{M}_k U_k = \mathrm{diag}(\varepsilon_k, -\varepsilon_k)$.

Fourier transforming these multiplication operators back to $\ell^2(\mathbb{Z})$ then yields an
operator $U$ on $\ell^2(\mathbb{Z}) \oplus \ell^2(\mathbb{Z})$ that satisfies (10.129). This yields a unique ground state
$\omega_0$ characterized by a property like (10.150) or (10.153), where

dk ^ 

^:^f{k){ukCk + Vkc\)\ 

Ck=Y^ 

jez 

jez 

In summary, one-dimensional fermionic models with quadratic Hamiltonians like
(10.111) have a unique ground state even at $N = \infty$. Thus one wonders where SSB in
the quantum Ising chain could possibly come from. We will answer this question.
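The fixed-$k$ diagonalization used above is easy to reproduce. The sketch below (assuming NumPy) builds the $2 \times 2$ matrix (10.173), compares its eigenvalues with $\pm\varepsilon_k$ from (10.161), and scans $k$ for the smallest positive eigenvalue; function names and the momentum grid are illustrative.

```python
import numpy as np

def symbol_matrix(k, lam):
    """The 2x2 matrix (10.173) at momentum k."""
    a = -(np.cos(k) + lam)
    b = -1j * np.sin(k)
    return np.array([[a, b], [-b, -a]])

def dispersion(k, lam):
    """eps_k from (10.161) with q_k replaced by k."""
    return np.sqrt(1.0 + lam ** 2 + 2.0 * lam * np.cos(k))

def minimal_gap(lam, n_k=2001):
    """Smallest positive eigenvalue of (10.173) over a grid of momenta in [-pi, pi]."""
    ks = np.linspace(-np.pi, np.pi, n_k)
    return min(np.linalg.eigvalsh(symbol_matrix(k, lam))[1] for k in ks)
```

On this grid the minimum is attained at $k = \pm\pi$ and equals $|1 - \lambda|$ for $\lambda > 0$, so the quadratic fermionic model by itself shows no asymptotic degeneracy for either $\lambda < 1$ or $\lambda > 1$.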

Almost every argument to follow relies on $\mathbb{Z}_2$-symmetry. In general, a $\mathbb{Z}_2$-action
on a C*-algebra $A$ corresponds to an automorphism $\theta : A \to A$ such that $\theta^2 = \mathrm{id}_A$,
i.e. $\theta$ represents the nontrivial element of $\mathbb{Z}_2$. For example, define $\theta : F \to F$ by

$$\theta(c_x^{\pm}) = -c_x^{\pm} \quad (x \in \mathbb{Z}), \tag{10.177}$$

which is an example of a Bogoliubov transformation (cf. Proposition 10.12) and
hence extends to an automorphism of $F$ (which implies that $\theta(1_F) = 1_F$). Clearly,
$\theta^2 = \mathrm{id}_F$, and in addition each local Hamiltonian (10.111) is invariant under $\theta$; by
implication, so is the dynamics $\alpha$, i.e., $\alpha_t \circ \theta = \theta \circ \alpha_t$ for all $t \in \mathbb{R}$.

A C*-algebra $A$ carrying a $\mathbb{Z}_2$-action decomposes as

$$A = A_+ \oplus A_-; \tag{10.178}$$

$$A_\pm = \{a \in A \mid \theta(a) = \pm a\}, \tag{10.179}$$

where the even part $A_+$ is a subalgebra of $A$, whereas the odd part $A_-$ is not: one
has $ab \in A_+$ for $a, b$ both in either $A_+$ or $A_-$, and $ab \in A_-$ if one is in $A_+$ and the
other in $A_-$. For example, if $A = B(H)$ for some Hilbert space $H$ and $w : H \to H$ is
a unitary operator satisfying $w^2 = 1$ (and hence $w^* = w$), then

$$\theta(a) = waw^* \;(= waw) \tag{10.180}$$

defines a $\mathbb{Z}_2$-action on $A$. In that case, $A_+$ and $A_-$ consist of all $a \in A$ that commute
and anticommute with $w$, respectively, that is,

$$A_\pm = \{a \in A \mid aw \mp wa = 0\}. \tag{10.181}$$

In case of (10.165) with (10.177), the subspace $F_+$ ($F_-$) is just the linear span of all
products of an even (odd) number of $c_x^{\pm}$'s.
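The decomposition (10.178) - (10.181) can be illustrated in the simplest case $A = M_2(\mathbb{C})$ with $w = \sigma_3$. This is a minimal sketch assuming NumPy; the function names and the sample matrix are illustrative.

```python
import numpy as np

SIGMA_3 = np.array([[1.0, 0.0], [0.0, -1.0]])

def theta(a, w=SIGMA_3):
    """Z_2-action a -> w a w* for a self-adjoint unitary w, cf. (10.180)."""
    return w @ a @ w.conj().T

def even_odd_parts(a, w=SIGMA_3):
    """Split a = a_plus + a_minus with theta(a_pm) = +/- a_pm, cf. (10.178)-(10.179)."""
    t = theta(a, w)
    return 0.5 * (a + t), 0.5 * (a - t)

a = np.array([[1.0, 2.0], [3.0, 4.0]])
a_plus, a_minus = even_odd_parts(a)
```

Here the even part is the diagonal of `a` and the odd part its off-diagonal, and the product of two odd elements is again even, in line with the grading rules stated above.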


(10.174) 

(10.175) 

(10.176) 
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Let us move to Theorem C.90 and reconsider the proof of the claim that if 
$\pi_\omega(A)' \neq \mathbb{C} \cdot 1$, then $\omega$ is mixed. If the commutant $\pi_\omega(A)'$ is nontrivial, then it
contains a nontrivial projection $e_+ \in \pi_\omega(A)'$. It then follows that $e_+\Omega_\omega \neq 0$; for
if $e_+\Omega_\omega = 0$, then $ae_+\Omega_\omega = e_+a\Omega_\omega = 0$ for all $a \in A$, so that $e_+ = 0$, since $\Omega_\omega$ is
cyclic. Similarly, $e_-\Omega_\omega \neq 0$ with $e_- = 1_H - e_+$, so we may define the unit vectors

$$\Omega_\pm = e_\pm\Omega_\omega / \|e_\pm\Omega_\omega\|, \tag{10.182}$$

and the associated states $\omega_\pm(a) = \langle\Omega_\pm, \pi_\omega(a)\Omega_\pm\rangle$ on $A$. This yields a convex
decomposition $\omega = \lambda\omega_+ + (1 - \lambda)\omega_-$, with $\lambda = \|e_+\Omega_\omega\|^2$. Since $\lambda \neq 0, 1$ and $\omega_+ \neq \omega_-$,
it follows that $\omega$ is mixed. The associated reduction is effected by writing

$$H = H_+ \oplus H_-; \tag{10.183}$$

$$H_\pm = e_\pm H, \tag{10.184}$$

in that $A$ (more precisely, $\pi_\omega(A)$) maps each subspace $H_\pm$ into itself. Now pass from
the projections $e_\pm$ to the operator $w = e_+ - e_-$, which by construction satisfies

$$w^* = w^{-1} = w. \tag{10.185}$$

In particular, $w$ is unitary. Conversely, if some unitary $w$ satisfies $w^2 = 1_H$, then

$$e_\pm = \tfrac{1}{2}(1_H \pm w) \tag{10.186}$$

are projections satisfying $e_+ + e_- = 1_H$, giving rise to the decomposition (10.184).
Group-theoretically, this means that one has a unitary $\mathbb{Z}_2$-action on $H = H_\omega$, in
which the nontrivial element of $\mathbb{Z}_2 = \{-1, 1\}$ is represented by $w$. The decomposition
(10.184) then simply means that $\mathbb{Z}_2$ acts trivially on $H_+$ (in that both group
elements are represented by the unit operator) and acts nontrivially on $H_-$ (in that
the nontrivial element is represented by minus the unit operator). In conclusion, one
has a $\mathbb{Z}_2$ perspective on the reduction of $H_\omega$, and instead of a projection $e \in \pi_\omega(A)'$
one may equivalently look for an operator $w \in \pi_\omega(A)'$ that satisfies (10.185).

Proposition 10.14. Suppose $A$ carries a $\mathbb{Z}_2$-action $\theta$ and consider a state $\omega : A \to \mathbb{C}$
that is $\mathbb{Z}_2$-invariant in the sense that $\omega(\theta(a)) = \omega(a)$ for all $a \in A$. We write this
as $\theta^*\omega = \omega$, with $\theta^*\omega = \omega \circ \theta$. Then there is a unitary operator $w : H_\omega \to H_\omega$
satisfying $w^2 = 1_H$, $w\Omega = \Omega$, and $w\pi_\omega(a)w^* = \pi_\omega(\theta(a))$ for each $a \in A$.

Cf. Corollary 9.12. In this situation, we obtain a decomposition of $H = H_\omega$ according
to (10.183), where the projections $e_\pm$ are given by (10.186), so that, equivalently,

$$H_\pm = \{\psi \in H \mid w\psi = \pm\psi\} = \overline{A_\pm\Omega}. \tag{10.187}$$

In terms of the decomposition (10.178), it is easily seen that each subspace $H_\pm$
is stable under $A_+$, whereas $A_-$ maps $H_\pm$ into $H_\mp$. We denote the restriction of
$\pi_\omega(A_+)$ to $H_\pm$ by $\pi_\pm$, so that a $\mathbb{Z}_2$-invariant state $\omega$ on $A$ not just gives rise to the
GNS-representation $\pi_\omega$ of $A$ on $H_\omega$, but also induces two representations $\pi_\pm$ of the
even part $A_+$ on $H_\pm$. This leads to a refinement of Theorem C.90:

Theorem 10.15. Suppose $A$ carries a $\mathbb{Z}_2$-action $\theta$, and let $\omega : A \to \mathbb{C}$ be a $\mathbb{Z}_2$-invariant
state. With the above notation, suppose the representation $\pi_+(A_+)$ on $H_+$
is irreducible. Then also the representation $\pi_-(A_+)$ on $H_-$ is irreducible, and there
are the following two possibilities for the representation $\pi_\omega(A)$ on $H = H_+ \oplus H_-$:

1. $\pi_\omega(A)$ is irreducible (and $\omega$ is pure) iff $\pi_+(A_+)$ and $\pi_-(A_+)$ are inequivalent.

2. $\pi_\omega(A)$ is reducible (and $\omega$ is mixed) iff $\pi_+(A_+)$ and $\pi_-(A_+)$ are equivalent.

Proof. The proof of this theorem is much more difficult than one would expect
(given its simple statement), so we restrict ourselves to the easy steps, as well as to
two examples illustrating each of the two possibilities. To start with the latter:

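1. $A = M_2(\mathbb{C})$, with $\theta(a) = \sigma_3 a \sigma_3$; note that $\sigma_3^2 = 1$ and $\sigma_3^* = \sigma_3$. Then

$$A_+ = \left\{ \begin{pmatrix} z_+ & 0 \\ 0 & z_- \end{pmatrix},\; z_\pm \in \mathbb{C} \right\} = D_2(\mathbb{C}); \tag{10.188}$$

$$A_- = \left\{ \begin{pmatrix} 0 & z_1 \\ z_2 & 0 \end{pmatrix},\; z_1, z_2 \in \mathbb{C} \right\}, \tag{10.189}$$

where $D_n(\mathbb{C})$ denotes the C*-algebra of diagonal $n \times n$ matrices. Take $\Omega = (1, 0)$,
with associated state

$$\omega(a) = \langle\Omega, a\Omega\rangle, \tag{10.190}$$

where $a \in M_2(\mathbb{C})$. It follows from §2.4 that the associated GNS-representation
$\pi_\omega(A)$ is just (equivalent to) the defining representation of $M_2(\mathbb{C})$ on $H_\omega = \mathbb{C}^2$,
in which the cyclic vector $\Omega_\omega$ of the GNS-construction is $\Omega$ itself. Since $\sigma_3\Omega = \Omega$,
the state defined by (10.190) is $\mathbb{Z}_2$-invariant, and the unitary operator $w$ in
Proposition 10.14 is simply $w = \sigma_3$. Hence the decomposition (10.183) of $H = \mathbb{C}^2$
is simply $\mathbb{C}^2 = \mathbb{C} \oplus \mathbb{C}$, i.e.,

$$H_+ = \{(z, 0),\; z \in \mathbb{C}\}; \tag{10.191}$$

$$H_- = \{(0, z),\; z \in \mathbb{C}\}. \tag{10.192}$$

Of course, we then have $H_\pm = A_\pm\Omega$. Identifying $H_\pm \cong \mathbb{C}$, this gives the one-dimensional
representations $\pi_\pm(D_2(\mathbb{C}))$ as

$$\pi_\pm\left( \begin{pmatrix} z_+ & 0 \\ 0 & z_- \end{pmatrix} \right) = z_\pm, \tag{10.193}$$

which are trivially inequivalent. Hence by Theorem 10.15 the defining representation
of $M_2(\mathbb{C})$ on $\mathbb{C}^2$ is irreducible, as it should be.

2. $A = D_2(\mathbb{C})$, with

$$\theta(\mathrm{diag}(z_+, z_-)) = \mathrm{diag}(z_-, z_+), \tag{10.194}$$

where we have denoted the matrix in (10.188) by $\mathrm{diag}(z_+, z_-)$. This time,

$$A_\pm = \{\mathrm{diag}(z, \pm z),\; z \in \mathbb{C}\}. \tag{10.195}$$

We once again define a $\mathbb{Z}_2$-invariant state $\omega$ by (10.190), but this time we take

$$\Omega = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{10.196}$$

Hence

$$H_\pm = \{(z, \pm z),\; z \in \mathbb{C}\}. \tag{10.197}$$

We may now identify each $A_\pm$ with $\mathbb{C}$ under the map $\mathrm{diag}(z, \pm z) \mapsto z$ from
$A_\pm$ to $\mathbb{C}$. Similarly, we identify each subspace $H_\pm$ with $\mathbb{C}$ under the map
$H_\pm \to \mathbb{C}$ defined by $(z, \pm z) \mapsto z$. Under these identifications, we have two one-dimensional
representations $\pi_\pm$ of the C*-algebra $\mathbb{C}$ on the Hilbert space $\mathbb{C}$, given
by $\pi_\pm(z) = z$. Clearly, these are equivalent: they are even identical. Hence by Theorem
10.15 the defining representation of $D_2(\mathbb{C})$ on $\mathbb{C}^2$ is reducible, as it should
be: the explicit decomposition of $\mathbb{C}^2$ in $D_2(\mathbb{C})$-invariant subspaces is just the one
(10.191) - (10.192) of the previous example.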

The first-numbered claim of Theorem 10.15 is relatively easy to prove from Theorem
C.90. Suppose $\pi_\pm(A_+)$ are inequivalent and take $b \in \pi_\omega(A)'$; we want to show
that $b = \lambda \cdot 1$ for some $\lambda \in \mathbb{C}$. Relative to $H = H_+ \oplus H_-$, we write

$$b = \begin{pmatrix} b_{++} & b_{+-} \\ b_{-+} & b_{--} \end{pmatrix}, \tag{10.198}$$

where the four operators in this matrix act as follows:

$$b_{++} : H_+ \to H_+,\quad b_{+-} : H_- \to H_+,\quad b_{-+} : H_+ \to H_-,\quad b_{--} : H_- \to H_-. \tag{10.199}$$

Since $A_+ \subset A$, we also have $b \in \pi_\omega(A_+)'$. The condition $[b, \pi_\omega(a)] = 0$ for each $a \in A_+$
is equivalent to the four conditions

$$[b_{++}, \pi_+(a)] = 0; \tag{10.200}$$

$$[b_{--}, \pi_-(a)] = 0; \tag{10.201}$$

$$\pi_+(a)\, b_{+-} = b_{+-}\, \pi_-(a); \tag{10.202}$$

$$\pi_-(a)\, b_{-+} = b_{-+}\, \pi_+(a). \tag{10.203}$$

We now use the fact (which we state without proof) that, as in group theory, the
irreducibility and inequivalence of $\pi_\pm(A_+)$ implies that there can be no nonzero
operator $c : H_+ \to H_-$ such that $c\,\pi_+(a) = \pi_-(a)\,c$ for all $a \in A_+$, and vice versa.
Hence $b_{+-} = 0$ as well as $b_{-+} = 0$. In addition, the irreducibility of $\pi_\pm(A_+)$ implies
that $b_{++} = \lambda_+ \cdot 1_{H_+}$ and $b_{--} = \lambda_- \cdot 1_{H_-}$. Finally, the property $[b, \pi_\omega(a)] = 0$ for each
$a \in A_-$ implies $\lambda_+ = \lambda_-$. Hence $b = \lambda \cdot 1$, and $\pi_\omega(A)$ is irreducible.

To prove the second-numbered claim of Theorem 10.15, let $\pi_+(A_+) \cong \pi_-(A_+)$,
so by definition (of equivalence) there is a unitary operator $v : H_- \to H_+$ such that

$$v\,\pi_-(a) = \pi_+(a)\,v, \quad \forall a \in A_+. \tag{10.204}$$

Extend $v$ to an operator $w : H \to H$ by

$$w = \begin{pmatrix} 0 & v \\ v^* & 0 \end{pmatrix}. \tag{10.205}$$

It is easy to verify from (10.204) that $[w, \pi(a)] = 0$ for each $a \in A_+$. To check
that the same is true for each $a \in A_-$, one needs the difficult analytical fact that
$w$ is a (weak) limit of operators of the kind $\pi(a_n)$, where $a_n \in A_-$, which also implies
that $w^*\pi(a) \in \pi(A_+)''$. Since $\pi(A_+)''' = \pi(A_+)'$ and $w \in \pi(A_+)'$, we obtain
$[w^*\pi(a), w] = 0$ for each $a \in A_-$. But for unitary operators $w$ this is the same as
$[w, \pi(a)] = 0$. So $w \in \pi(A)'$, and hence $\pi(A)$ is reducible by Theorem C.90. □

In determining the ground state(s) of the quantum Ising chain, we will apply Theorem
10.15 to the C*-algebra (10.165). This application relies on the representation
theory of $F$. For the moment we leave the Hilbert space $H$ general, equipped though
with a conjugation $J : H \to H$. It turns out to be convenient to use the self-dual
formulation of the CAR, which treats $c$ and $c^*$ on an equal footing. Define

$$K = H \oplus H, \tag{10.206}$$

whose elements are written as $h = (f, g)$ or $h = f \dotplus g$, with inner product

$$\langle h_1, h_2\rangle_K = \langle f_1, f_2\rangle_H + \langle g_1, g_2\rangle_H. \tag{10.207}$$

We then introduce a new operator in $\mathrm{CAR}(H)$, namely the field

$$\Phi(h) = c^*(f) + c(Jg), \tag{10.208}$$

which is linear in $h = f \dotplus g$, because the antilinearity of $c(f)$ in $f$ is canceled by the
antilinearity of $J$. This yields the anti-commutation relations

$$[\Phi^*(h_1), \Phi(h_2)]_+ = \langle h_1, h_2\rangle_K, \tag{10.209}$$

but be aware that generally $[\Phi^*(h_1), \Phi^*(h_2)]_+$ and $[\Phi(h_1), \Phi(h_2)]_+$ do not vanish.
Indeed, in terms of the antilinear operator $\Gamma : K \to K$, defined by

$$\Gamma = \begin{pmatrix} 0 & J \\ J & 0 \end{pmatrix}, \tag{10.210}$$

we have the following expression for the adjoint $\Phi(h)^* = \Phi^*(h)$:

$$\Phi^*(h) = \Phi(\Gamma h). \tag{10.211}$$

If we identify $f \in H$ with $f \dotplus 0 \in K$, we may reconstruct $c$ and $c^*$ from $\Phi$ through

$$c^*(f) = \Phi(f); \tag{10.212}$$

$$c(f) = \Phi(\Gamma f). \tag{10.213}$$

Bogoliubov transformations now take an extremely elegant form. For any unitary
operator $S$ on $K$ that satisfies $[S, \Gamma] = 0$, we define the transform $\Phi_S$ of $\Phi$ by

$$\Phi_S(h) = \Phi(Sh), \tag{10.214}$$

with associated creation- and annihilation operators (where $H \ni f \equiv f \dotplus 0$, as above)

$$c_S^*(f) = \Phi_S(f); \tag{10.215}$$

$$c_S(f) = \Phi_S(\Gamma f). \tag{10.216}$$

To see the equivalence with the original formulation of the Bogoliubov transformation,
note that for unitary $S$, the condition $[S, \Gamma] = 0$ is equivalent to the structure

$$S = \begin{pmatrix} u & vJ \\ Jv & JuJ \end{pmatrix}, \tag{10.217}$$

where $u : H \to H$ is linear, $v : H \to H$ is antilinear, and $u$ and $v$ satisfy (10.133) -
(10.134). Moreover, from (10.137) - (10.138) we obtain

$$c_S(f) = \eta(f); \tag{10.218}$$

$$c_S^*(f) = \eta^*(f). \tag{10.219}$$

An interesting class of pure states on $\mathrm{CAR}(H)$ arises as follows.

Theorem 10.16. There is a bijective correspondence between:

• Projections $e : K \to K$ that (apart from the properties $e^2 = e^* = e$) satisfy

$$\Gamma e \Gamma = 1_K - e; \tag{10.220}$$

• States $\omega_e$ on $F$ that satisfy

$$\omega_e(\Phi(h)^*\Phi(h)) = \langle h, eh\rangle \quad \forall h \in K. \tag{10.221}$$

Such a state $\omega_e$ is automatically pure (so that the corresponding GNS-representation
$\pi_e$ is irreducible), and is explicitly given by

$$\omega_e(\Phi(h_1) \cdots \Phi(h_{2n+1})) = 0; \tag{10.222}$$

$$\omega_e(\Phi(h_1) \cdots \Phi(h_{2n})) = {\sum_{p \in \mathfrak{S}_{2n}}}' \mathrm{sgn}(p) \prod_{j=1}^{n} \langle e h_{p(2j)}, \Gamma h_{p(2j-1)}\rangle, \tag{10.223}$$

where the sum $\sum'$ is over all permutations $p$ of $1, \ldots, 2n$ such that

$$p(2j-1) < p(2j); \tag{10.224}$$

$$p(1) < p(3) < \cdots < p(2n-1). \tag{10.225}$$

We omit the proof. Note that (10.221) is a special case of (10.223), because of
(10.211). States like $\omega_e$, which are determined by their two-point functions, are
called quasi-free; the ground state $\omega_0$ on $\mathrm{CAR}(\mathbb{C}^N)$ constructed in the previous
section is an example (one also has mixed quasi-free states, e.g. certain KMS states).
As a warm-up, we reconstruct the ground state of the free fermionic Hamiltonian on $F$ using the above formalism. That is, we assume that the Hamiltonian in (10.111) reads

$\sum_{x=-N/2}^{N/2-1} \varepsilon_x c_x^* c_x$, (10.226)

initially defining dynamics on $F = \mathrm{CAR}(\mathbb{C}^N)$. In that case, the projection $e_0$ onto the second copy of $H = \mathbb{C}^N$ in $K$, i.e.

$e_0 = \begin{pmatrix} 0 & 0 \\ 0 & 1_H \end{pmatrix}$, (10.227)

reproduces the ground state $\omega_0(a) = \langle 0|a|0\rangle$, where $|0\rangle$ is the vector $1 \in \mathbb{C}$ in the fermionic Fock space over $H$, such that $c(f)|0\rangle = 0$ for all $f \in H$. This also works for $N = \infty$, i.e., we construct dynamics on CAR($\ell^2(\mathbb{Z})$) from the local Hamiltonians (10.226) as indicated at the beginning of this section, and use the same formula for $e_0$, this time with $H = \ell^2(\mathbb{Z})$. In the more general case (10.111), we replace $e_0$ in (10.227) by

$e_0^S = S e_0 S^{-1}$, (10.228)

where $S$ is given by (10.217), in which for $N < \infty$ the operators $u$ and $v$ were constructed in (10.131) - (10.132). This time, the associated state $\omega_{e_0^S} \equiv \omega_S$ is the state called $\omega_0$ in Theorem 10.13. As explained at the beginning of this section, this procedure even works for $N = \infty$ and hence $H = \ell^2(\mathbb{Z})$.

Having understood fermionic models with quadratic Hamiltonians, what remains to be done now is to reformulate the original quantum Ising chain, defined in terms of the local spin matrices $\sigma_i(x)$, in terms of the fermionic variables $c_x$ and $c_x^*$. For finite $N$ this was done through the Jordan-Wigner transformation (10.102) - (10.103). This time we need a similar isomorphism between $A$ and $F$, where

$A = \bigotimes_{j \in \mathbb{Z}} M_2(\mathbb{C})$; (10.229)

$F = \mathrm{CAR}(\ell^2(\mathbb{Z}))$, (10.230)

and hence we would need to start the sums in the right-hand side of (10.102) - (10.103) at $j = -\infty$. At first sight this appears to be impossible, though, because operators like $\exp(\pi i \sum_{y < x} \sigma_+(y)\sigma_-(y))$ do not lie in $A$ (whose elements have infinite tails of $2 \times 2$ unit matrices). Fortunately, this problem can be solved by adding a formal operator $T$ to $A$, which plays the role of the “tail”

“T = W)”. (10.231) 

This formal expression (to be used only heuristically) suggests the relations:

$T^2 = 1$; (10.232)

$T^* = T$; (10.233)

$T a T = \theta_-(a)$, (10.234)

where $\theta_- : A \to A$ is a $\mathbb{Z}_2$-action defined by (algebraic) extension of

$\theta_-(\sigma_\pm(y)) = -\sigma_\pm(y)$ ($y \le 0$); (10.235)

$\theta_-(\sigma_\pm(y)) = \sigma_\pm(y)$ ($y > 0$); (10.236)

$\theta_-(\sigma_3(y)) = \sigma_3(y)$ ($y \in \mathbb{Z}$); (10.237)

$\theta_-(\sigma_0(y)) = \sigma_0(y)$ ($y \in \mathbb{Z}$), (10.238)

where $\sigma_0 = 1_2$. Formally, define an algebra extension

$\hat{A} = A \oplus A \cdot T$, (10.239)

with elements of the type $a + bT$, $a, b \in A$, and algebraic relations given by (10.232) - (10.234). That is, we have

$(a + bT)^* = a^* + \theta_-(b^*)T$; (10.240)

$(a + bT) \cdot (a' + b'T) = aa' + b\theta_-(b') + (ab' + b\theta_-(a'))T$. (10.241)
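The product and adjoint rules (10.240) - (10.241) follow mechanically from $T^2 = 1$, $T^* = T$ and $TaT = \theta_-(a)$. The following sketch checks them in a finite toy model and should not be read as the infinite-chain construction itself: it assumes four sites $y = -1, 0, 1, 2$ and realizes $T$ concretely as $\sigma_3$ on the sites $y \le 0$. In such a finite chain $T$ already lies in the matrix algebra, so the decomposition $a + bT$ is not unique there; only the algebraic rules are being tested. The class name `Ext` is hypothetical.

```python
import numpy as np

SZ = np.diag([1.0, -1.0]).astype(complex)
EYE = np.eye(2, dtype=complex)

def kron_all(factors):
    out = np.eye(1, dtype=complex)
    for f in factors:
        out = np.kron(out, f)
    return out

# Toy chain with sites y = -1, 0, 1, 2 (assumption); T is sigma_3 on y <= 0.
T = kron_all([SZ, SZ, EYE, EYE])

def theta_minus(a):
    # (10.234): T a T
    return T @ a @ T

class Ext:
    """Formal element a + bT, stored as the pair (a, b)."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __mul__(self, other):
        # product rule (10.241)
        return Ext(self.a @ other.a + self.b @ theta_minus(other.b),
                   self.a @ other.b + self.b @ theta_minus(other.a))

    def star(self):
        # adjoint rule (10.240)
        return Ext(self.a.conj().T, theta_minus(self.b.conj().T))

    def as_matrix(self):
        return self.a + self.b @ T

rng = np.random.default_rng(1)
def random_matrix():
    return rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))

x = Ext(random_matrix(), random_matrix())
y = Ext(random_matrix(), random_matrix())

print(np.allclose((x * y).as_matrix(), x.as_matrix() @ y.as_matrix()))
print(np.allclose(x.star().as_matrix(), x.as_matrix().conj().T))
```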

Within A, the correct version of (10.102) - (10.103) may now be written down as 

c± = (x<l); (10.242) 

c± = Ta^; (10.243) 

c± = (x>l), (10.244) 

with formal inverse transformation given by 

(7±(x) = cf (x < 1); (10.245) 

a±(x) = Tcf; (10.246) 

(7±(x) = (x>l), (10.247) 

where this time we regard $T$ as an element of the extended fermionic algebra

$\hat{F} = F \oplus F \cdot T$, (10.248)

satisfying the same rules (10.232) - (10.234), but now in terms of a “fermionic” $\mathbb{Z}_2$-action $\theta_- : F \to F$ given by extending the following action on elementary operators:

$\theta_-(c_y^\pm) = -c_y^\pm$ ($y \le 0$); (10.249)

$\theta_-(c_y^\pm) = c_y^\pm$ ($y > 0$). (10.250)

(10.251) 
Because of $T$, the Jordan-Wigner transformation does not give an isomorphism $A \cong F$, but it does give an isomorphism $\hat{A} \cong \hat{F}$. More importantly, if, having already defined the $\mathbb{Z}_2$-action $\theta$ on $F$ by (10.177), we define a similar $\mathbb{Z}_2$-action on $A$ by

$\theta(\sigma_\pm(y)) = -\sigma_\pm(y)$ ($y \in \mathbb{Z}$); (10.252)

$\theta(\sigma_3(y)) = \sigma_3(y)$ ($y \in \mathbb{Z}$); (10.253)

$\theta(\sigma_0(y)) = \sigma_0(y)$ ($y \in \mathbb{Z}$), (10.254)

and decompose $A = A_+ \oplus A_-$ and $F = F_+ \oplus F_-$ according to this action, cf. (10.178), we have isomorphisms

$A_+ \cong F_+$; (10.255)

$A_- \cong F_- T$; (10.256)

$A \cong F_+ \oplus F_- T$. (10.257)

For given dynamics (10.111), suppose $\omega_0^A$ is a $\mathbb{Z}_2$-invariant ground state on $A$. Then $\omega_0^A$ also defines a $\mathbb{Z}_2$-invariant ground state $\omega_0^F$ on $F$ by (10.255) and $\omega_0^F(f) = 0$ for all $f \in F_-$. Conversely, a $\mathbb{Z}_2$-invariant ground state $\omega_0^F$ on $F$ defines a state $\omega_0^A$ on $A$ by (10.255) and $\omega_0^A(a) = 0$ for all $a \in A_-$. But $F$ has a unique ground state, so:

• Either $\omega_0^A$ is pure on $A$, in which case it is the unique ground state on $A$;

• Or $\omega_0^A$ is mixed on $A$, in which case $\omega_0^A = \frac{1}{2}(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm$ are pure but transform under the above $\mathbb{Z}_2$-action $\theta$ as $\omega_0^\pm \circ \theta = \omega_0^\mp$.

Theorem 10.15 gives a representation-theoretical criterion deciding between these possibilities, but to apply it we need some information on the restriction of $\mathbb{Z}_2$-invariant quasi-free pure states on $F$ to its even part $F_+$. The abstract setting involves a $\mathbb{Z}_2$-action $W$ on $K$ that commutes with $\Gamma$ (so that $W$ is unitary, $W^2 = 1$, and $[\Gamma, W] = 0$), which induces a $\mathbb{Z}_2$-action $\theta$ on $F$ by linear and algebraic extension of $\theta(\Phi(h)) = \Phi(Wh)$. A quasi-free state $\omega_e$, defined according to Theorem 10.16 by a projection $e : K \to K$ that satisfies (10.220), is then $\mathbb{Z}_2$-invariant iff $[W, e] = 0$.

In our case, this simplifies to $\theta(\Phi(h)) = -\Phi(h)$, so that $W = -1$, and every projection commutes with $W$. In any case, with considerable effort one can prove:

Lemma 10.17. Given some $\mathbb{Z}_2$-action $W$ on $K$, as well as a projection $e : K \to K$ satisfying (10.220), such that $[W, \Gamma] = [W, e] = 0$:

1. The quasi-free state $\omega_e$ of Theorem 10.16 is $\mathbb{Z}_2$-invariant (i.e., $\omega_e \circ \theta = \omega_e$);

2. The corresponding GNS-representation space $H_e \equiv H_{\omega_e}$ for $F = F_+ \oplus F_-$ decomposes as $H_e = H_e^+ \oplus H_e^-$, with $H_e^\pm = F_\pm \Omega_e$. Each subspace is stable under $\pi_e(F_+)$, and the restriction of $\pi_e(F_+)$ to $H_e^\pm$ is irreducible.

Theorem 10.15 then leads to a lemma, which also summarizes the discussion so far. 

Lemma 10.18. 1. For given Z 2 -invariant dynamics, let (O^ be the (unique, Z 2 - 
invariant) ground state on F = Fq. ©F_. Under F+ C F the associated GNS- 
representation space Hq decomposes as Hq = (BHq , with — T±f2o, and 
we denote the restriction o/7ro(Ff) to by TT^. Then 7r^(F+) are irreducible. 
2. Regard (Oq also as a state (0^ on F^(BF^T by putting (Oq (a) = Ofor all a GF^T, 
and similarly as a state (Oq on A by invoking (10.255) and putting (0^{a) = Ofor 
all a G A_. Let Hq = (BHf be the GNS-representation space of Fj^ (BF^T 
defined by (0^, where Hf_ = F+Q. and Hf = F TQ. Here and Hf are stable 
under Fj^-; we denote the restriction ofF+ to by n!y, so that 

a. Then (Oq is a ground state on A. Any ’L 2 -invariant ground state on A arises in 
this way (via F), so that there is a unique 'Ij 2 -invariant ground state on A. 

b. The state (Oq is pure on A iff the irreducible representations 7z'^{F+) (or 
TTq (^+)) and 7rf(F-Q) are inequivalent. 

It turns out to be difficult to directly check the (in)equivalence of 7 rJ(F+). For¬ 
tunately, we can circumvent this problem by passing to yet another (irreducible) 
representation of F^. We first enlarge F to a new algebra 

F = F®FT =F+®F^®F+T®F^T, (10.258) 

and extend the state (Oq on F to a state (Ho on F by putting d)o{FT) =0, so that a)o 
is nonzero only on F+ C F. Let no be the associated GNS-representation of F on the 
Hilbert space Hq = FQ. Under n(F+) this space decomposes as 


Ho =F+Qo® FLf2o © F+ TQq (BF^TQq, (10.259) 

with corresponding restrictions 7f±(/©) and nJ({F^)-, more precisely, n± is the re¬ 
striction of 7f(F+) to F±f2o, whilst nj. is is the restriction of to F±TQo. 

Clearly, n±{F+) is the same as ;r^(F'+), and nf{F+) is just our earlier nf{F+), but 
n'^(F+) is new. To understand the latter, we rewrite (10.259) as 

Ho=Ho®H^-, (10.260) 

Ho = F+QQ®F^Qo^F^®FGn()-, (10.261) 

H^ = F+ TQo ®F^Tao, ( 10.262) 

the point being that n{F) evidently restricts to both Hq and Hq . We know the action 
of n{F) on Ho quite well; it is the representation induced by the ground state chq. As 
to Hq , we define a state Oq on F by 

(blia) = {n{T)FloMa)Tt{T)Qo)H^ = {Flo,Tt{Q-{a))^o) (10.263) 

where the second equality follows from (10.234). Comparing Ho and Ho, for all 
b GF (and hence especially for b = 9^ (a)) we simply have 

{Qo,n{b)Qo}H^ = &Q{b) = (o!o (b), (10.264) 

so that Oq = (Oq oQ_ = 91 (Oq . Decomposing the GNS-representation space Hq,^f 

of na*,,F{F) as Ha*,,F — Hf p (BHG p, it follows that nf(F+) is the restriction 

h_(Oq V t o_coq 07 ®o +' ^ 
of to Therefore, the representation it{F) restricted to Hq is the 

GNS-representation nQt^{F), so that in turn fc'^{F+) is Kq,^f{F+), restricted to 
^e*aiQ - Hence, further to (10.260) - (10.262), we obtain the decomposition 

nF)-n^^{F)(Bnst^^{F). (10.265) 

The point is that for the quantum Ising chain Hamiltonian (10.110), we have; 

Lemma 10.19. 7. For each X ^ ±1, we have 7t^F{F) = 

2. If this holds, then the representations 7r^(7ff) = 7t^f{F+) and kL{F+) are in- 

®o 

equivalent iff the representations K'^p(Fff) and (T+) are equivalent. 

3. For each X f ztl, the ground state COq is pure on A iff the representations 

Ttf^riF^) and (7^+) are equivalent. 

The first claim follows from Theorem 10.20 below. The third follows from Lemma 
10.18 and the previous claims. The second claim is proved by repeatedly applying 
Theorem 10.15 to Tt{F). Given this lemma, the real issue now lies in comparing n^F 
and Ttgt^F, both as representations of F (as they are defined) and as representations 
of 7^+ C F. This can be settled in great generality by first looking at Theorem 10.16, 
and thence, recalling the positive-energy projection (10.228), realizing that 

H = (10.266) 

= V4^V_- (10.267) 

Here $W_- : K \to K$ is the $\mathbb{Z}_2$-action on $K$ defining the $\mathbb{Z}_2$-action $\theta_-$ on $F$ as explained above Lemma 10.17; specifically, $W_-$ is the direct sum of two copies of $w_- : \ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$, defined by $(w_- f)_j = f_j$ ($j > 0$) and $(w_- f)_j = -f_j$ ($j \le 0$). Subsequently, without proof we invoke a basic result on the CAR-algebra:

Theorem 10.20. Let $e$ and $e'$ be projections on $K$ that satisfy (10.220). Then:

1. $\pi_e(F) \cong \pi_{e'}(F)$ iff $e - e' \in B_2(K)$;

2. $\pi_e^+(F_+) \cong \pi_{e'}^+(F_+)$ iff $e - e' \in B_2(K)$ and $\dim(eK \cap (1 - e')K)$ is even.

If the first condition is satisfied, the dimension in the second part is finite, so that one may indeed say it is even or odd. From Lemmas 10.18 and 10.19 and Theorem 10.20, we finally obtain the phase structure of the infinite quantum Ising chain:

Theorem 10.21. The unique $\mathbb{Z}_2$-invariant ground state $\omega_0^A$ of the Hamiltonian (10.110) is pure (and hence forms the unique ground state) iff both of the following hold:

$e_0^S - W_- e_0^S W_- \in B_2(K)$; (10.268)

$\dim(e_0^S K \cap (1 - W_- e_0^S W_-)K)$ is even. (10.269)

This is true for all $\lambda$ with $|\lambda| > 1$. If $|\lambda| < 1$, then $\omega_0^A = \frac{1}{2}(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm$ are pure and transform under the $\mathbb{Z}_2$-action $\theta$ as $\omega_0^\pm \circ \theta = \omega_0^\mp$.
We are now going to study SSB in so-called mean-field theories: these are quantum spin systems with Hamiltonians like the Curie-Weiss-model for ferromagnetism:

$h_\Lambda^{CW} = -\frac{J}{2|\Lambda|} \sum_{x,y \in \Lambda} \sigma_3(x)\sigma_3(y) - B \sum_{x \in \Lambda} \sigma_1(x)$, (10.270)

where $J > 0$ scales the spin-spin coupling, and $B$ is an external magnetic field. Similar to the quantum Ising model, (10.270) has a $\mathbb{Z}_2$-symmetry $(\sigma_1, \sigma_2, \sigma_3) \mapsto (\sigma_1, -\sigma_2, -\sigma_3)$, which at each site $x$ is implemented by $u(x) = \sigma_1(x)$. This model differs from its short-range counterpart (9.42), i.e., the quantum Ising model, or the Heisenberg model (9.44), in that every spin now interacts with every other spin. It falls into the class of homogeneous mean-field theories, which are defined by a single-site Hilbert space $H_x = H = \mathbb{C}^n$ and local Hamiltonians of the type

$h_\Lambda = |\Lambda| \, h(T_0^{(\Lambda)}, T_1^{(\Lambda)}, \ldots, T_{n^2-1}^{(\Lambda)})$. (10.271)

Here $T_0 = 1_n$, and the matrices $(T_i)_{i=1}^{n^2-1}$ in $M_n(\mathbb{C})$ form a basis of the real vector space of traceless self-adjoint $n \times n$ matrices; the latter may be identified with $i$ times the Lie algebra $\mathfrak{su}(n)$ of $SU(n)$, so that $(T_0, T_1, \ldots, T_{n^2-1})$ is a basis of $i$ times the Lie algebra $\mathfrak{u}(n)$ of the unitary group $U(n)$ on $\mathbb{C}^n$. In those terms, we define

$T_i^{(\Lambda)} = \frac{1}{|\Lambda|} \sum_{x \in \Lambda} T_i(x)$. (10.272)

Finally, $h$ is a polynomial (which is sensitive to operator ordering). For example, to cast (10.270) (with $J = 1$) in the form (10.271), take $n = 2$, $T_i = \frac{1}{2}\sigma_i$ ($i = 1, 2, 3$), and

$h(T_1, T_2, T_3) = -2(T_3^2 + B T_1)$. (10.273)
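That (10.273) really recasts the Curie-Weiss Hamiltonian in the mean-field form (10.271) can be confirmed for a few spins. The sketch below is a numerical check only, under the reading that (10.270) is $-\frac{J}{2|\Lambda|}\sum_{x,y}\sigma_3(x)\sigma_3(y) - B\sum_x \sigma_1(x)$ with $J = 1$; the helper names are hypothetical. It also verifies the $\mathbb{Z}_2$-symmetry implemented by $\sigma_1$ at every site.

```python
import numpy as np

S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.diag([1.0, -1.0])

def site_op(op, x, n):
    """op acting on site x of an n-site chain, identity elsewhere."""
    out = np.eye(1)
    for k in range(n):
        out = np.kron(out, op if k == x else np.eye(2))
    return out

def total(op, n):
    return sum(site_op(op, x, n) for x in range(n))

def curie_weiss(n, B, J=1.0):
    # direct form: -(J / 2n) sum_{x,y} s3(x) s3(y) - B sum_x s1(x)
    tot3, tot1 = total(S3, n), total(S1, n)
    return -(J / (2 * n)) * tot3 @ tot3 - B * tot1

def mean_field_form(n, B):
    # n * h(T_1, T_3) with h = -2 (T_3^2 + B T_1) and T_i = average of sigma_i / 2
    T1 = total(S1 / 2, n) / n
    T3 = total(S3 / 2, n) / n
    return n * (-2.0 * (T3 @ T3 + B * T1))

n_sites, field = 3, 0.7
H_direct = curie_weiss(n_sites, field)
H_mf = mean_field_form(n_sites, field)

# Z_2 symmetry: sigma_1 on every site flips sigma_3 -> -sigma_3
U = np.eye(1)
for _ in range(n_sites):
    U = np.kron(U, S1)

print(np.allclose(H_direct, H_mf))
print(np.allclose(U @ H_direct @ U, H_direct))
```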

The assumptions of Theorem 9.15 do not hold now, and indeed the local dynamics (9.40) fails to converge to global dynamics on the quasi-local C*-algebra $A$ defined by (8.130). Fortunately, it does converge to a global dynamics on the C*-algebra $C(S(B))$, where $B = M_n(\mathbb{C})$ is the single-site algebra. In order to describe the limiting dynamics of (homogeneous) mean-field models as $\Lambda \nearrow \mathbb{Z}^d$, we equip the state space $S(B)$ with the Poisson structure (8.52), which we now elucidate.

For unital C*-algebras $B$, we may regard $S(B)$ as a w*-compact subspace of either the complex vector space $B^*$ or the real vector space $B^*_{sa}$; in the latter case we regard states as linear maps $\omega : B_{sa} \to \mathbb{R}$ that satisfy $\omega(1_B) = 1$ and $\omega(a^2) \ge 0$ for each $a \in B_{sa}$. If $B = M_n(\mathbb{C})$, which is all we need, we may furthermore identify $B^*_{sa}$ with $i\mathfrak{u}(n)^*$, and since the value of each state $\omega \in S(M_n(\mathbb{C}))$ is fixed on $T_0 = 1_n \in i\mathfrak{u}(n)$, it follows that $S(M_n(\mathbb{C}))$ is a compact convex subset of $i\mathfrak{su}(n)^*$. In that case, the Poisson bracket (8.52) on $S(M_n(\mathbb{C}))$ is none other than the restriction of (minus) the canonical Lie-Poisson bracket on $\mathfrak{su}(n)^* \cong i\mathfrak{su}(n)^*$ to $S(M_n(\mathbb{C}))$, cf. (3.98) - (3.99).
For example, for $n = 2$ we have $S(M_2(\mathbb{C})) \cong B^3$ by Proposition 2.9, i.e.,

$\omega_{(x,y,z)}(a) = \mathrm{Tr}(\rho(x,y,z)a)$ ($(x,y,z) \in B^3$, $a \in M_2(\mathbb{C})$); (10.274)

$\rho(x,y,z) = \frac{1}{2}\begin{pmatrix} 1+z & x-iy \\ x+iy & 1-z \end{pmatrix}$. (10.275)

We also have $\mathfrak{su}(2)^* \cong \mathbb{R}^3$ upon the choice of the basis $(T_i = \frac{1}{2}\sigma_i)$, $i = 1, 2, 3$, of $i\mathfrak{su}(2)$, which means that $\theta_{(x,y,z)} \in i\mathfrak{su}(2)^*$ maps $(T_1, T_2, T_3)$ to $(x, y, z)$ (where this time $(x,y,z) \in \mathbb{R}^3$, cf. §5.8). If we now regard the matrices $T_i$ as functions $\hat{T}_i$ on $B^3$ by $\hat{T}_i(\omega) = \omega(T_i)$, we find that the corresponding functions on $B^3$ are given by

$\hat{T}_1(x,y,z) = \frac{1}{2}x$, $\hat{T}_2(x,y,z) = \frac{1}{2}y$, $\hat{T}_3(x,y,z) = \frac{1}{2}z$. (10.276)

The corresponding Poisson brackets (8.52) are $\{\hat{T}_1, \hat{T}_2\} = -\hat{T}_3$ etc., i.e., $\{x, y\} = -2z$ etc.; this is $-2$ times the bracket defined in (3.43) or (3.97) - (3.98). This factor 2 could have been avoided by moving to the three-ball with radius $r = 1/2$ instead of $r = 1$, whose boundary is the coadjoint orbit $\mathcal{O}_{1/2}$ naturally associated to spin-$\frac{1}{2}$.
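The coordinate functions (10.276) and the bracket $\{x, y\} = -2z$ can be reproduced with a few lines of linear algebra. The sketch below assumes the standard Bloch parametrization $\rho = \frac{1}{2}(1 + x\sigma_1 + y\sigma_2 + z\sigma_3)$ for (10.275), and it assumes that on elementary functions (8.52) reads $\{\hat{a}, \hat{b}\}(\omega) = i\omega([a, b])$, which is how that equation is used in the proof of Theorem 10.22; both are readings of the text rather than quotations.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)
T1, T2, T3 = S1 / 2, S2 / 2, S3 / 2

def rho(x, y, z):
    # density matrix of the point (x, y, z) in the unit ball, cf. (10.275)
    return 0.5 * np.array([[1 + z, x - 1j * y], [x + 1j * y, 1 - z]])

def omega(point, a):
    # state (10.274)
    return np.trace(rho(*point) @ a)

def bracket(point, a, b):
    # assumed form of (8.52) on elementary functions: i * omega([a, b])
    return 1j * omega(point, a @ b - b @ a)

p = (0.3, -0.2, 0.5)
coords = [omega(p, T).real for T in (T1, T2, T3)]
b12 = bracket(p, T1, T2).real

print(coords)          # half of (x, y, z), cf. (10.276)
print(b12, -p[2] / 2)  # {T1^, T2^} = -T3^, i.e. {x, y} = -2z
```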

We now return to our continuous bundle of C*-algebras $A^{(c)}$ of Theorem 8.4, of course in the slightly adapted form appropriate to quantum spin systems, see §8.6. In particular, we recall that $A_0^{(c)} = C(S(B))$ and $A_{1/N}^{(c)} = B(H_{\Lambda_N})$, cf. (8.157) - (8.158), and hence we see the limit $N \to \infty$ as a specific way of taking the limit $\Lambda \nearrow \mathbb{Z}^d$ along the hypercubes $\Lambda_N$. Symmetric and quasi-symmetric sequences $(a_{1/N})_{N \in \mathbb{N}}$ are defined as explained after (8.161). The following observation is fundamental.

Theorem 10.22. Let $B = M_n(\mathbb{C})$. If $(a_{1/N})_{N \in \mathbb{N}}$ and $(b_{1/N})_{N \in \mathbb{N}}$ are symmetric sequences with limits $a_0$ and $b_0$ as defined by (8.46), respectively (so that $(a_{1/N})_{N \in \mathbb{N}}$ and $(b_{1/N})_{N \in \mathbb{N}}$ are continuous sections of the continuous bundle $A^{(c)}$), then the sequence

$(\{a_0, b_0\}, i[a_1, b_1], \ldots, i|\Lambda_N|[a_{1/N}, b_{1/N}], \ldots)$ (10.277)

defines a continuous section of $A^{(c)}$. In particular, for each $\omega \in S(B)$ we have

$\lim_{N \to \infty} \omega^{|\Lambda_N|}(i|\Lambda_N|[a_{1/N}, b_{1/N}]) = \{a_0, b_0\}(\omega)$. (10.278)

Proof. The proof is a straightforward combinatorial exercise, and we just mention the simplest case where $d = 1$ and $a_{1/N} = S_{1,N}(a_1)$ and $b_{1/N} = S_{1,N}(b_1)$, where $a_1 \in B$ and $b_1 \in B$, cf. (8.39). Then $a_0 = \hat{a}_1$, $b_0 = \hat{b}_1$, and similarly to (8.45) we find

$[S_{1,N}(a_1), S_{1,N}(b_1)] = \frac{1}{N} S_{1,N}([a_1, b_1])$. (10.279)

Using (8.52), we find that (10.277) is equal to $(i\widehat{[a_1, b_1]}, \ldots, iS_{1,N}([a_1, b_1]), \ldots)$. Since $\omega^N(S_{1,N}([a_1, b_1])) = \omega([a_1, b_1])$, the left-hand side of (10.278) is therefore equal to $i\omega([a_1, b_1])$, which by (8.52) equals the right-hand side. □
In other words, although the sequence of commutators $[a_{1/N}, b_{1/N}]$ converges to zero (which is why $A_0^{(c)}$ has to be commutative!), the rescaled commutators $i|\Lambda_N|[a_{1/N}, b_{1/N}]$ converge to the macroscopic observable $\{a_0, b_0\} \in C(S(B))$. This reconfirms the analogy between the limit $N \to \infty$ and the limit $\hbar \to 0$ of Chapter 7, see especially Definitions 7.1 and 8.2. With $B = M_n(\mathbb{C})$, Theorem 10.22 implies the central result about the macroscopic (and hence classical!) dynamics of mean-field theories:

Corollary 10.23. Let ^ continuous section of defined by a sym¬ 

metric sequence, and let {a\iisi)f^^^ be an arbitrary continuous section ofA^‘^^ (i.e. 
a quasi-symmetric sequence). Then, writing hijf^ = hj^^for clarity, the sequence 

(ao{t),e‘'^^i‘aie-‘''''if---e‘''^N‘a^^^e^‘'^^N\...y (10.280) 

where ao{t) is the solution of the equations of motion on S{M„(C)) with classical 
Hamiltonian ho and Poisson bracket (8.52), defines a continuous section of A^‘^\ 

In other words, the Heisenberg dynamics on $A_{1/N}^{(c)} = B(H_{\Lambda_N})$ defined by the quantum Hamiltonians $h_{\Lambda_N}$ converges to the classical dynamics on the Poisson manifold $S(M_n(\mathbb{C}))$ that is generated by their classical limit, viz. the Hamiltonian $h_0$.

For example, since the operators $T_i^{(\Lambda)}$ form symmetric sequences, so do Hamiltonians of the type (10.271). The limit $h_0 \in C(S(M_n(\mathbb{C})))$ of the family $(h_\Lambda)$ in (10.271) is simply obtained by replacing the operators $T_i^{(\Lambda)}$ in the function $h$ by the functions $\hat{T}_i$ on $S(M_n(\mathbb{C}))$. Equivalently, one may replace the $T_i^{(\Lambda)}$ by the canonical coordinates $(\theta_i)$ of $i\mathfrak{su}(n)^*$ dual to the basis $(T_1, \ldots, T_{n^2-1})$ of $i\mathfrak{su}(n)$, i.e., $\theta_i(T_j) = \delta_{ij}$, and restrict the ensuing function on $i\mathfrak{su}(n)^*$ to $S(M_n(\mathbb{C})) \subset i\mathfrak{su}(n)^*$. Using (10.276), for the Curie-Weiss model (10.270) with $J = 1$ this gives

$h_0^{CW}(x,y,z) = -\frac{1}{2}z^2 - Bx$. (10.281)

The ground states of this Hamiltonian are simply its minima, viz.

$x_\pm = (B, 0, \pm\sqrt{1 - B^2})$ ($0 \le B < 1$); (10.282)

$x = (1, 0, 0)$ ($B \ge 1$), (10.283)

all of which lie on the boundary $S^2$ of $B^3$. Note that the points $x_\pm$ coalesce as $B \to 1$, where they form a saddle point. Modulo our use of radius $r = 1$ instead of $r = 1/2$, this result coincides with (10.81) for the classical limit of the quantum Ising model.
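The minima (10.282) - (10.283) can be confirmed by brute force. The following sketch simply evaluates $h_0^{CW}(x,y,z) = -\frac{1}{2}z^2 - Bx$ on a polar grid covering the closed unit ball and reports the grid point of lowest energy; the grid resolution and the function names are arbitrary choices made for illustration.

```python
import numpy as np

def h0_cw(x, y, z, B):
    # classical Curie-Weiss Hamiltonian (10.281)
    return -0.5 * z ** 2 - B * x

def minimise_on_ball(B, n_r=21, n_theta=181, n_phi=360):
    """Grid search over the closed unit ball in spherical coordinates."""
    r = np.linspace(0.0, 1.0, n_r)
    theta = np.linspace(0.0, np.pi, n_theta)                 # 1 degree steps
    phi = np.linspace(0.0, 2 * np.pi, n_phi, endpoint=False)
    R, TH, PH = np.meshgrid(r, theta, phi, indexing="ij")
    X = R * np.sin(TH) * np.cos(PH)
    Y = R * np.sin(TH) * np.sin(PH)
    Z = R * np.cos(TH)
    H = h0_cw(X, Y, Z, B)
    k = np.argmin(H)
    return np.array([X.flat[k], Y.flat[k], Z.flat[k]]), H.flat[k]

point_broken, energy_broken = minimise_on_ball(0.5)   # regime 0 <= B < 1
point_sym, energy_sym = minimise_on_ball(1.5)         # regime B >= 1

print(point_broken, energy_broken)
print(point_sym, energy_sym)
```

For $B = 0.5$ the search returns one of the two points $x_\pm$; which one is reported depends only on floating-point tie-breaking, since both have the same energy $-\frac{1}{2} - \frac{1}{2}B^2$.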

We now turn to symmetry and its possible breakdown. Suppose there is some subgroup of $U(n)$, typically the image of a unitary representation $g \mapsto u_g$ of a compact group $G$ on $\mathbb{C}^n$, under which $h(T_0, T_1, \ldots, T_{n^2-1})$ in (10.271) satisfies

$h(T_0, u_g T_1 u_g^*, \ldots, u_g T_{n^2-1} u_g^*) = h(T_0, T_1, \ldots, T_{n^2-1})$ ($g \in G$). (10.284)

For example, in the Curie-Weiss model one has $G = \mathbb{Z}_2$, whose nontrivial element is represented by $\sigma_1$. For (10.271) itself this implies that $h_\Lambda$ is invariant under the corresponding unitaries on $H_\Lambda$, cf. (10.69).
Hence also in homogeneous mean-field models we obtain the structure (10.57), (10.58), and (10.59) familiar from the case of short-range forces. For the limit theory this implies that the classical Hamiltonian $h_0$ on $S(M_n(\mathbb{C}))$ is invariant under the coadjoint action of $G \subset U(n)$ on $i\mathfrak{su}(n)^*$, restricted to $S(M_n(\mathbb{C})) \subset i\mathfrak{su}(n)^*$: in the Curie-Weiss model this “classical shadow” of the $\mathbb{Z}_2$ symmetry of the quantum theory is simply the map $(x,y,z) \mapsto (x, -y, -z)$ on $B^3$.

In the regime $0 < B < 1$, the degenerate ground states of this model break this symmetry. In contrast, it can be shown from the Perron-Frobenius Theorem (which applies since both $\sigma_3$ and $\sigma_1$ are real matrices) that for $B > 0$ each quantum-mechanical Hamiltonian (10.270) has a unique ground state $\Psi_N^{(0)}$. Being unique, this vector must share the invariance of the Hamiltonian under the permutation group $\mathfrak{S}_N$, so that

$\Psi_N^{(0)} = \sum_{n_+=0}^{N} c(n_+/N)\,|n_+, n_-\rangle$, (10.285)

where $|n_+, n_-\rangle$ is the totally symmetrized unit vector in $\bigotimes^N \mathbb{C}^2$ with $n_+$ spins up and $n_- = N - n_+$ spins down, and $c : \{0, 1/N, 2/N, \ldots, (N-1)/N, 1\} \to [0, 1]$ is some function such that $\sum_{n_+} c(n_+/N)^2 = 1$ (we may assume $c \ge 0$ by the Perron-Frobenius Theorem). The asymptotic behaviour of $c$ as $N \to \infty$ has been studied, and as expected, $c$ converges pointwise to $c(0) = c(1) = \sqrt{1/2}$, and zero elsewhere (at $B = 0$ one of course has either $c(0) = 1$ or $c(1) = 1$ for all $N$).

Thus we encounter a familiar headache: the “higher-level” theory $C(S(M_n(\mathbb{C})))$ at $N = \infty$ breaks the $\mathbb{Z}_2$ symmetry, whereas the “lower-level” quantum theories $B(H_{\Lambda_N})$ ($N < \infty$) do not, although the former should be a limiting case of the latter. Indeed, the situation for the Curie-Weiss model in the regime $0 < B < 1$ is exactly analogous to the double-well potential as well as to the quantum Ising model in the same regime: if the two degenerate ground states $x_\pm \in B^3$ of $h_0^{CW}$ are reinterpreted as Dirac measures $\delta_\pm$ on $B^3$, which in turn are seen as (pure) states $\omega_\pm$ on the classical algebra of observables $C(S(M_2(\mathbb{C})))$, then (10.74) holds, mutatis mutandis.

The resolution of this problem through the restoration of Butterfield’s Principle should also be the same as for the previous two cases: there is a first excited state such that as $N \to \infty$ the energy difference with the ground state approaches zero and one has approximate symmetry breaking as in (10.75). Alas, for the Curie-Weiss model so far only numerical evidence is available supporting this scenario.
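The kind of numerical evidence alluded to here is cheap to generate for moderate $N$. The sketch below is an illustration under explicit assumptions, not a reproduction of any published computation: it takes $J = 1$, restricts (10.270) to the permutation-symmetric sector (spin $j = N/2$, where $\sum_x \sigma_3(x) = 2S_z$ and $\sum_x \sigma_1(x) = 2S_x$), and computes the ground state together with the gap to the next level inside that sector. The function names are hypothetical.

```python
import numpy as np

def cw_symmetric_sector(N, B, J=1.0):
    """Curie-Weiss Hamiltonian (10.270) on the symmetric (spin N/2) sector."""
    j = N / 2.0
    m = np.arange(-j, j + 1.0)                       # S_z eigenvalues, N + 1 of them
    Sz = np.diag(m)
    off = 0.5 * np.sqrt(j * (j + 1) - m[:-1] * (m[:-1] + 1))
    Sx = np.diag(off, 1) + np.diag(off, -1)
    # sum_x sigma_3(x) = 2 S_z and sum_x sigma_1(x) = 2 S_x
    return -(2.0 * J / N) * Sz @ Sz - 2.0 * B * Sx

def ground_state_and_gap(N, B):
    evals, evecs = np.linalg.eigh(cw_symmetric_sector(N, B))
    v = evecs[:, 0]
    v = v * np.sign(v[0])                            # fix the overall sign
    return v, evals[1] - evals[0]

B_FIELD = 0.5
v10, gap10 = ground_state_and_gap(10, B_FIELD)
_, gap40 = ground_state_and_gap(40, B_FIELD)

print(np.all(v10 > 0))          # Perron-Frobenius: coefficients can be chosen positive
print(np.allclose(v10, v10[::-1]))   # invariance under spin flip m -> -m
print(gap10, gap40)             # the gap shrinks as N grows (for 0 < B < 1)
```

The coefficients `v10` play the role of $c(n_+/N)$ in (10.285), and the symmetry `v10 == v10[::-1]` is the $\mathbb{Z}_2$-invariance of the unique ground state.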

Equilibrium states of homogeneous mean-field models at any inverse temperature $0 < \beta < \infty$ exist, despite the fact that in such models time-evolution $\alpha_t$ on the infinite system $A$ (and hence the KMS condition characterizing equilibrium states) is ill-defined (unless one passes to certain representations of $A$, which would be question-begging). Instead, one invokes the quasi-local C*-algebra $A$, cf. (8.130), and in lieu of KMS states looks for limit points $\hat{\omega}^\beta \in S(A)$ of the local Gibbs states defined by (9.96) as $N \to \infty$, see (10.44) and surrounding discussion. Proposition 10.8 does not apply now, but Theorem 8.9 does: since each local Hamiltonian is permutation-invariant (because each $T_i^{(\Lambda)}$ is), so is each local Gibbs state, and accordingly, each w*-limit point of this sequence must share this property.
As in (8.174), from the quantum De Finetti Theorem 8.9 we therefore have:

$\hat{\omega}^\beta = \int_{S(M_n(\mathbb{C}))} d\mu_\beta(\theta)\, (\omega_\theta^\beta)^\infty$, (10.286)

for some probability measure $\mu_\beta$ on the single-spin state space $S(M_n(\mathbb{C}))$. By Proposition 8.28, this measure may also be regarded as a limit of the local Gibbs states, but now regarded as a state on the limit algebra $A_0^{(c)} = C(S(M_n(\mathbb{C})))$ rather than as a state on $A_0^{(q)} = A$. By the same token, each state $\omega_\theta^\beta$ in the decomposition (10.286) is a pure state on $A_0^{(c)}$ (though seen as a state on $M_n(\mathbb{C})$ it will be mixed!). The states $\omega_\theta^\beta$ are computed as follows. Given a classical Hamiltonian $h_0$ computed from (10.271) as explained after Corollary 10.23, for each point $\theta = (\theta_0, \ldots, \theta_{n^2-1}) \in i\mathfrak{u}(n)^*$ we define a new self-adjoint operator $h_\theta \in M_n(\mathbb{C})$ by

$h_\theta = h_0(\theta) \cdot 1_n + \sum_{i=1}^{n^2-1} \frac{\partial h_0}{\partial \theta_i}(\theta)\, T_i$. (10.287)

For example, in the Curie-Weiss model, from (10.273) we have

$h_0^{CW}(\theta) = -2(\theta_3^2 + B\theta_1)$; (10.288)

$h_\theta^{CW} = h_0^{CW}(\theta) - 2\theta_3\sigma_3 - B\sigma_1$. (10.289)

Eq. (10.287) has the following origin. Let $\omega$ be any state on $A$ for which the strong limit $T_i^\omega$ of each operator $\pi_\omega(T_i^{(\Lambda_N)})$ on $H_\omega$ exists as $N \to \infty$ (for example, as in the proof of Theorem 8.16 one may show that this is the case when $\omega$ is a permutation-invariant state of $A$). It easily follows that $T_i^\omega$ lies in the algebra at infinity for $\pi_\omega$, and hence in the center of $\pi_\omega(A)''$, cf. §8.5. If, in addition, $\omega$ is primary, then

$T_i^\omega = \theta_i \cdot 1_{H_\omega}$; (10.290)

$\theta_i = \lim_{N \to \infty} \omega(T_i^{(\Lambda_N)})$. (10.291)

Under these assumptions, we compute the commutator 

[KaihA,),naia)]=y^^[TY\...,TY-i)- E 

where a G U^Ayi, and 0{\I\Am\) denotes a finite sum of (multiple) commutators 
between some power of and operators that are (norm-) bounded in N. For 
example, for the Curie-Weiss model the 0{1 /|Aaj|) term is a multiple of 

y [[;r(B(ff3W),7r(B(a)],(73^''^^]. (10.292) 
Since $a$ is local, all commutators $\sum_{x \in \Lambda_N} [\pi_\omega(T_i(x)), \pi_\omega(a)]$ are in $\pi_\omega(A)$, so that further commutators à la (10.292) vanish as $N \to \infty$. Also, in this limit the terms $T_i^{(\Lambda_N)}$ in the argument of $\partial h_0/\partial \theta_i$ assume their c-number values $\theta_i$, so that

$\lim_{N \to \infty} [\pi_\omega(h_{\Lambda_N}), \pi_\omega(a)] = [h_\omega, \pi_\omega(a)]$, (10.293)

where formally (i.e. on a suitable domain) we have an $\omega$-dependent Hamiltonian

$h_\omega = \sum_{x \in \mathbb{Z}^d} \pi_\omega(h_\theta(x))$, (10.294)

where the $\theta_i$ depend on $\omega$ via (10.291). Also, for each $a \in A$ one has strong limits

$\lim_{N \to \infty} \pi_\omega\left(e^{ith_{\Lambda_N}} a\, e^{-ith_{\Lambda_N}}\right) = e^{ith_\omega} \pi_\omega(a)\, e^{-ith_\omega}$. (10.295)

Hence in the limit $N = \infty$ (provided it makes sense, which it does under the stated assumptions), the original mean-field Hamiltonian (10.271) with its homogeneous long-range forces converges to a sum of single-body Hamiltonians, in which the original forces between the spins have been incorporated into the parameters $\theta_i$.
Returning to (10.286), for any $\beta$ we now determine $\omega_\theta^\beta$ from the Ansatz

$\omega_\theta^\beta(a) = \frac{\mathrm{Tr}(e^{-\beta h_\theta} a)}{\mathrm{Tr}(e^{-\beta h_\theta})}$, (10.296)

where $\theta$ is found by solving the self-consistency equation

$\omega_\theta^\beta = \theta$. (10.297)

As explained after Corollary 10.23, here $\omega_\theta^\beta : M_n(\mathbb{C})_{sa} \to \mathbb{R}$ is defined by its values on $i\mathfrak{su}(n)$ and hence should be seen as a map $i\mathfrak{su}(n) \to \mathbb{R}$, like $\theta \in i\mathfrak{su}(n)^*$, so that (10.297) consists of $n^2 - 1$ equations $\omega_\theta^\beta(T_i) = \theta_i$ ($i = 1, \ldots, n^2 - 1$). Alternatively, one may extend $\theta$ from $i\mathfrak{su}(n)$ to $i\mathfrak{u}(n)$ by prescribing $\theta(1_n) = 1$, and subsequently extend it further to $M_n(\mathbb{C})$ by complex linearity. Clearly, the constant $h_0(\theta)$ in (10.287) drops out of (5.152) and may be ignored in solving (10.297).

For example, if we take (10.289) with $B = 0$, then (10.297) forces $\theta_1 = \theta_2 = 0$, whereas the magnetization $2\theta_3 \equiv m = \omega_\theta^\beta(\sigma_3)$ satisfies the famous gap equation

$\tanh(\beta m) = m$. (10.298)

For any $\beta$ this has a solution $m = 0$, i.e., $\theta = 0$ in $B^3$, which corresponds to the tracial state $\omega(a) = \frac{1}{2}\mathrm{Tr}(a)$ normally associated with infinite temperature (i.e., $\beta = 0$). This state is evidently $\mathbb{Z}_2$-invariant. For $T > T_c = 1/4$ (i.e. $\beta < 4$) this is the only solution. For $T < T_c$ (or $\beta > 4$), two additional solutions $\pm m_\beta$ (with $m_\beta > 0$) appear, which break the $\mathbb{Z}_2$ symmetry. For $B > 0$ computations become tedious, but for $\beta \to \infty$, where $\omega_\theta^\beta$ converges to the ground state of $h_\theta$, one recovers our earlier conclusions.
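The self-consistency equation (10.297) lends itself to a simple fixed-point iteration. The sketch below is one possible numerical approach, not a method taken from the text: it uses the Curie-Weiss single-site Hamiltonian (10.289) without its irrelevant constant, evaluates the Gibbs state (10.296) by diagonalization, and iterates $\theta \mapsto (\omega_\theta^\beta(T_i))_i$. The two sample temperatures, the starting point and the iteration count are arbitrary illustrative choices.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)

def gibbs_expectations(theta, beta, B):
    """(omega(T_1), omega(T_2), omega(T_3)) for the Gibbs state of h_theta."""
    h = -2.0 * theta[2] * S3 - B * S1          # (10.289) minus the constant
    w, v = np.linalg.eigh(h)
    weights = np.exp(-beta * (w - w.min()))    # shifted for numerical stability
    weights /= weights.sum()
    rho = (v * weights) @ v.conj().T
    return np.array([np.trace(rho @ s).real / 2 for s in (S1, S2, S3)])

def solve_self_consistency(beta, B, theta0, iterations=500):
    theta = np.array(theta0, dtype=float)
    for _ in range(iterations):
        theta = gibbs_expectations(theta, beta, B)
    return theta

theta_cold = solve_self_consistency(beta=5.0, B=0.0, theta0=(0.0, 0.0, 0.4))
theta_hot = solve_self_consistency(beta=0.5, B=0.0, theta0=(0.0, 0.0, 0.4))

m_cold = 2 * theta_cold[2]
print(theta_cold, np.tanh(5.0 * m_cold) - m_cold)   # nonzero magnetization
print(theta_hot)                                    # collapses to theta = 0
```

Starting instead from `theta0 = (0, 0, -0.4)` at the low temperature yields the mirror solution, which is the numerical face of the broken $\mathbb{Z}_2$ symmetry.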
Proposition 10.24. The self-consistency equation (10.297) has at least one solution.

Proof. This follows from Brouwer’s Fixed Point Theorem (stating that any continuous map $f$ from a compact convex set $K$ to itself has a fixed point), applied to $K = S(M_n(\mathbb{C}))$ and $f(\theta) = \omega_\theta^\beta$, where $\theta \in S(M_n(\mathbb{C}))$, as just explained. □

The key result on equilibrium states of homogeneous mean-field theories, then, is:

Theorem 10.25. Let $h_\Lambda$ in (10.271) define a homogeneous mean-field theory with compact symmetry group $G$. The sequence of local Gibbs states defined by (9.96) and (10.271) has a unique $G$-invariant limit point $\hat{\omega}^\beta$, whose decomposition into primary states is given by (10.286). The $G$-invariant probability measure $\mu_\beta$ is concentrated on some $G$-orbit in $S(M_n(\mathbb{C}))$, and the states $\omega_\theta^\beta$ on $M_n(\mathbb{C})$ are given by (10.296), with Hamiltonians $h_\theta$ defined by (10.287), where $\theta$ satisfies (10.297).

Proof. We just sketch the proof, which is based on the Quantum De Finetti Theorem 8.9. Each operator $T_i^{(\Lambda)}$ is permutation-invariant, which property is transferred first to each local Hamiltonian $h_\Lambda$, thence to each local Gibbs state defined by $h_\Lambda$, and finally to each limit point $\hat{\omega}^\beta$ of this sequence. As already noted, Theorem 8.9 then gives the decomposition (10.286), which by Theorem 8.29 (whose assumption holds in mean-field models) also gives the primary decomposition of $\hat{\omega}^\beta$ (i.e., each state $(\omega_\theta^\beta)^\infty$ is primary on the quasi-local algebra $A$). By our earlier argument centered on (10.294) - (10.295), time-evolution is implemented in the GNS-representation induced by such a state. An important step in the proof (which we omit because it requires various reformulations of the KMS condition we have not discussed) is that $(\omega_\theta^\beta)^\infty$ satisfies the KMS condition with respect to the dynamics (10.295). This, in turn, implies (10.296), which, by definition of $\theta$ through (10.290) - (10.291), gives the self-consistency condition (10.297). The proof is completed by a tricky argument (which again uses alternatives to the KMS condition) to the effect that if some $\omega_\theta^\beta$ breaks the $G$-symmetry, the probability measure $\mu_\beta$ on the $G$-orbit in $S(M_n(\mathbb{C}))$ through $\omega_\theta^\beta$ induced by the normalized Haar measure on $G$, defines the only possible limit point of the local Gibbs states, and hence must be unique. □

Thus SSB can be detected by solving (10.297) and checking if the ensuing state(s) $\omega_\theta^\beta$ on $M_n(\mathbb{C})$ is (are) $G$-invariant. As we have seen, in the Curie-Weiss model this is the case for $\beta < 4$, whereas for $\beta > 4$ the measure $\mu_\beta$ in (10.286) is given by

$\mu_\beta = \frac{1}{2}\left(\delta_{(0,0,m_\beta/2)} + \delta_{(0,0,-m_\beta/2)}\right)$, (10.299)

where $\delta_\theta(f) = f(\theta)$. In such cases, since each local Gibbs state is invariant, one faces the (by now) familiar threat to Earman’s Principle. In response, we expect Butterfield’s Principle to be restored through the introduction of asymmetric flea-type perturbations to $h_\Lambda$ that are localized in spin configuration space, although at nonzero temperature all excited states (rather than just the first) will start to play a role, and the precise details of the “flea” scenario remain to be settled.
10.9 The Goldstone Theorem


So far, we have only discussed the simplest of all symmetry groups, namely $G = \mathbb{Z}_2$, which is both finite and abelian. Although it will not change our picture of SSB, for the sake of completeness (and interest to foundations) we also present a brief introduction to continuous symmetries, culminating in the Goldstone Theorem and the Higgs mechanism (which at first sight contradict each other and hence require a very careful treatment). The former results when the broken symmetry group $G$ is a Lie group, whereas the latter arises when it is an infinite-dimensional gauge group.

Let us start with the simple case $G = SO(2)$, acting on $\mathbb{R}^2$ by rotation. This induces the obvious action on the classical phase space $T^*\mathbb{R}^2 \cong \mathbb{R}^4$, i.e.,

$R(p, q) = (Rp, Rq)$, (10.300)

cf. (3.94), as well as on the quantum Hilbert space $H = L^2(\mathbb{R}^2)$, that is,

$u_R\psi(x) = \psi(R^{-1}x)$. (10.301)

Let us see what changes with respect to the action of $\mathbb{Z}_2$ on $\mathbb{R}$ considered in §10.1. We now regard the double-well potential $V$ in (10.11) as an $SO(2)$-invariant function on $\mathbb{R}^2$ through the reinterpretation of $x^2$ as $x_1^2 + x_2^2$. This is the Mexican hat potential. Thus the classical Hamiltonian $h(p, q) = p^2/2m + V(q)$, similarly with $p^2 = p_1^2 + p_2^2$, is $SO(2)$-invariant, and the set of classical ground states

$E_0 = \{(p, q) \in T^*\mathbb{R}^2 \mid p = 0, q^2 = a^2\}$ (10.302)

is the $SO(2)$-orbit through e.g. the point $(p_1 = p_2 = 0, q_1 = a, q_2 = 0)$. Unlike the one-dimensional case, the set of ground states is now connected and forms a circle in phase space, on which the symmetry group $SO(2)$ acts. The intuition behind the Goldstone Theorem is that a particle can freely move in this circle at no cost of energy. If we look at mass as inertia, such motion is “massless”, as there is no obstruction. However, this intuition is only realized in quantum field theory. In quantum mechanics, the ground state of the Hamiltonian (10.6) (now acting on $L^2(\mathbb{R}^2)$) remains unique, as in the one-dimensional case. In polar coordinates $(r, \phi)$ we have


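$h_\hbar = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial \phi^2}\right) + V(r)$, (10.303)

with $V(r) = \frac{1}{4}\lambda(r^2 - a^2)^2$. With

$L^2(\mathbb{R}^2) \cong L^2(\mathbb{R}^+) \otimes \ell^2(\mathbb{Z})$ (10.304)

under Fourier transformation in the angle variable, this becomes

$h_\hbar\psi(r, n) = \left(-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{n^2}{r^2}\right) + V(r)\right)\psi(r, n)$. (10.305)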
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Since T?'rP'/2mr^ is positive, the ground state has = 0 for all n ^ 0, 

and hence it is 5(9(2)-invariant, since the 5(9(2)-action on L^(K^) becomes 

M 0 V/(r,n) = exp(/n0)v/(r,n), (10.306) 

after a Fourier-transform. Indeed, from a group-theoretical point of view, the unitary 
isomorphism (10.304) is nothing but the decomposition 

(10.307) 

mGZ 


where //„ = L^(K+) for all n, but with G Hn transforming under 5(9(2) as 

=exp(/n0)^„(r) (0 G [0,2;r]). (10.308) 

The $SO(2)$-invariant subspace of $L^2(\mathbb{R}^2)$, then, is precisely the space $H_0$ in which $\psi_\hbar^{(0)}$ lies. This is analogous to the situation occurring in one dimension higher (i.e. $\mathbb{R}^3$) with e.g. the hydrogen atom: in that case, the symmetry group is $SO(3)$, and $L^2(\mathbb{R}^3)$ decomposes accordingly as

$L^2(\mathbb{R}^3) \cong \bigoplus_{j\in\mathbb{N}} H_j$;  (10.309)

$H_j = L^2(\mathbb{R}^+) \otimes \mathbb{C}^{2j+1}$.  (10.310)

The ground state for a spherically symmetric potential, then, lies in $H_0$ and is $SO(3)$-invariant. For our purposes the relevant comparison is with the one-dimensional case: the decomposition of $L^2(\mathbb{R})$ under the natural $\mathbb{Z}_2$-action $u_{-1}\psi(x) = \psi(-x)$ is

$L^2(\mathbb{R}) = H_0 \oplus H_1$;  (10.311)

$H_i = \{\psi \in L^2(\mathbb{R}) \mid \psi(x) = (-1)^i\psi(-x)\}$, $i = 0,1$.  (10.312)

This time, $H_0$ is the $\mathbb{Z}_2$-invariant subspace containing the ground state $\psi_\hbar^{(0)}$. Being $\mathbb{Z}_2$-invariant, $\psi_\hbar^{(0)}$ has peaks above both classical minima $\pm a$; in fact, $\psi_\hbar^{(0)}$ is real-valued and strictly positive. The ground state of the corresponding two-dimensional system, seen as an element of $L^2(\mathbb{R}^2)$, is just this wave-function extended from $\mathbb{R}$ to $\mathbb{R}^2$ by rotational invariance. Hence the ground state remains real-valued and strictly positive, with peaks above the circle of classical minima in $\mathbb{R}^2$.
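
The step from (10.305) to the vanishing of all $n \neq 0$ components of the ground state can be spelled out as a short variational estimate. This is only a sketch under assumptions not fixed in the text: $h_n$ denotes the operator (10.305) at fixed $n$, acting on $L^2(\mathbb{R}^+)$ with measure $r\,dr$, and $E_n$ denotes its lowest eigenvalue, taken to be attained by a unit vector $\psi_\hbar^{(n)}$ (as one expects for a confining potential). Both $h_n$ and $E_n$ are labels introduced here for illustration.

```latex
% Assumed labels: h_n = operator (10.305) at fixed n; E_n = its lowest eigenvalue.
\begin{align*}
h_n &= h_0 + \frac{\hbar^2 n^2}{2 m r^2}, \\
E_n = \langle \psi_\hbar^{(n)}, h_n \psi_\hbar^{(n)} \rangle
    &= \langle \psi_\hbar^{(n)}, h_0 \psi_\hbar^{(n)} \rangle
     + \frac{\hbar^2 n^2}{2m} \int_0^\infty \frac{|\psi_\hbar^{(n)}(r)|^2}{r^2}\, r\, dr \\
    &> \langle \psi_\hbar^{(n)}, h_0 \psi_\hbar^{(n)} \rangle \;\geq\; E_0
    \qquad (n \neq 0).
\end{align*}
```

Under these assumptions the lowest energy in each sector $H_n$ with $n \neq 0$ lies strictly above $E_0$, so the overall ground state has no component outside $H_0$.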

Let us recall the situation for $d = 1$ (cf. §10.1). The first excited state $\psi_\hbar^{(1)}$ lies in $H_1$; it is real-valued, like $\psi_\hbar^{(0)}$, but since it has to satisfy $\psi_\hbar^{(1)}(-x) = -\psi_\hbar^{(1)}(x)$, it cannot be positive. Indeed, with a suitable choice of phase, $\psi_\hbar^{(1)}$ has one positive peak above $a$ and the same peak but now negative below $-a$. Then the wave-function

$\psi_\hbar^{\pm} = (\psi_\hbar^{(0)} \pm \psi_\hbar^{(1)})/\sqrt{2}$,  (10.313)

is peaked above $\pm a$ alone (i.e., the negative peak of $\pm\psi_\hbar^{(1)}$ below $\mp a$ exactly cancels the corresponding peak of $\psi_\hbar^{(0)}$). The classical limit of $\psi_\hbar^{(0)}$ comes out as the mixed state $\frac{1}{2}(\omega_0^+ + \omega_0^-)$, where $\omega_0^\pm = (p = 0, \pm a)$, but each state $\psi_\hbar^{\pm}$ has the pure state $\omega_0^\pm$ as its classical limit. The latter are ground states, and hence in particular they are time-independent, because the energy difference between $\psi_\hbar^{(0)}$ and $\psi_\hbar^{(1)}$ vanishes (even exponentially fast) as $\hbar \to 0$.

A similar but more complicated situation arises in $d = 2$. The role of the pair

$(\psi_\hbar^{(0)} \in H_0,\ \psi_\hbar^{(1)} \in H_1)$

is now played by an infinite tower of unit vectors

$\psi_\hbar^{(n)} \in H_n$, $n \in \mathbb{Z}$,

where $\psi_\hbar^{(n)}$ is the lowest energy eigenstate (for $h_\hbar$ in (10.305)) in $H_n \subset L^2(\mathbb{R}^2)$. The analogue of the states $\psi_\hbar^{\pm}$ for $d = 1$ involves a limit of superpositions of the vectors in this tower, but this limit does not exist in $L^2(\mathbb{R}^2)$. As in §10.1, we instead rely on the technique explained around (10.4), which makes the relevant unit vectors converge to some probability measure on phase space in the limit $N \to \infty$. In the subsequent limit $\hbar \to 0$, one obtains a probability measure concentrated on a suitable point in the orbit of classical ground states (10.302). Similarly, in the same sense the ground state $\psi_\hbar^{(0)}$ converges to a probability measure supported by all of $\mathcal{E}_0$.

To the extent that there is a Goldstone Theorem in classical mechanics, it would state that motion in the orbit $\mathcal{E}_0$ is free. That is, at fixed $(r = a, p_r = 0)$, where $p_r$ is the radial component of momentum, one has an effective Hamiltonian

$h(p_\phi,\phi) = \frac{p_\phi^2}{2ma^2}$,  (10.315)

whose time-independent states $(p_\phi = 0, \phi_0)$ for arbitrary $\phi_0 \in [0,2\pi)$ yield the ground states of the system, and whose “excited states”

$(p_\phi(t),\phi(t)) = \left(p_\phi(0),\ \phi(0) + \frac{p_\phi(0)\,t}{ma^2}\right)$  (10.316)

give motion along the orbit $\mathcal{E}_0$ with effective mass $ma^2$, whose energy converges to zero as $p_\phi \to 0$. However, since massless particles (whose existence is the main conclusion of the usual Goldstone Theorem) are not defined in classical mechanics, we now turn to relativistic field theory (with which we assume some familiarity).
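
The free motion along the circle of minima follows directly from Hamilton's equations for the effective Hamiltonian (10.315). The following is a minimal worked sketch, assuming the canonical pair is $(p_\phi,\phi)$ with the standard sign conventions; the symbol $E$ for the conserved energy is introduced here for illustration only.

```latex
% Hamilton's equations for h(p_\phi,\phi) = p_\phi^2 / (2 m a^2); E is a label used only here.
\begin{align*}
\dot{\phi} &= \frac{\partial h}{\partial p_\phi} = \frac{p_\phi}{m a^2}, &
\dot{p}_\phi &= -\frac{\partial h}{\partial \phi} = 0, \\
p_\phi(t) &= p_\phi(0), &
\phi(t) &= \phi(0) + \frac{p_\phi(0)\, t}{m a^2}, \\
E &= \frac{p_\phi(0)^2}{2 m a^2} \;\longrightarrow\; 0
  & &\text{as } p_\phi(0) \to 0 .
\end{align*}
```

Since $h$ does not depend on $\phi$, there is no restoring force along the orbit, which is the classical shadow of the absence of a mass term for the Goldstone mode.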
We now illustrate SSB in classical field theory through a simple example, where the symmetry group is $G = SO(N)$, but wherever possible we write things down in such a way that the generalization to arbitrary scalar field theories is obvious. Suppose we have $N$ real scalar fields $\varphi = (\varphi_1,\ldots,\varphi_N)$, on which $SO(N)$ acts in the defining representation on $\mathbb{R}^N$. Following the physics literature, from now on we sum over repeated indices like $i$ and $\mu$ (Einstein summation convention). Let the Lagrangian

$\mathcal{L} = \frac{1}{2}\partial_\mu\varphi_i\partial^\mu\varphi_i - V(\varphi)$,  (10.317)

contain an $SO(N)$-invariant potential $V$, typically of the form (with $\varphi^2 = \varphi_i\varphi_i$)

$V(\varphi) = -\frac{1}{2}m^2\varphi^2 + \frac{1}{4}\lambda\varphi^4$,  (10.318)

where $\lambda > 0$, but $m^2$ may have either sign. If $m^2 < 0$, the minimum of $V$ lies at $\varphi = 0$, but if $m^2 > 0$ the minima form the $SO(N)$-orbit through

$\varphi^c = (v,0,\ldots,0)$;  (10.319)

$v = m/\sqrt{\lambda} = \|\varphi^c\|$.  (10.320)

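The idea is that the physical fields are excitations of the “vacuum state” $\varphi^c$, so that, instead of $\varphi$, as the appropriate “small oscillation” field one should use

$\chi(x) = \varphi(x) - \varphi^c$.  (10.321)

Consequently, the potential is expanded in a Taylor series for small $\chi$ as

$V(\varphi) = V(\varphi^c) + \frac{1}{2}V''_{ij}\chi_i\chi_j + O(\chi^3)$;  (10.322)

$V''_{ij} = \frac{\partial^2V}{\partial\varphi_i\partial\varphi_j}(\varphi^c)$.  (10.323)

Note that the linear term vanishes because $V'(\varphi^c) = 0$. We now use the $SO(N)$-invariance of $V$, i.e., $V(g\varphi) = V(\varphi)$ for all $g \in SO(N)$. For $T_a \in \mathfrak{g}$ (i.e. the Lie algebra of $G$, realized by anti-symmetric traceless $N \times N$ matrices) this yields

$\frac{d}{dt}V(e^{tT_a}\varphi)\big|_{t=0} = 0 \;\Leftrightarrow\; \frac{\partial V}{\partial\varphi_i}(T_a)_{ij}\varphi_j = 0$.  (10.324)

Differentiation with respect to $\varphi_k$ and putting $\varphi = \varphi^c$ then gives

$V''_{ki}(T_a)_{ij}\varphi^c_j = 0$.  (10.325)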

In general, let $H \subset G$ be the stabilizer of $\varphi^c$, i.e., $g \in H$ iff $g\varphi^c = \varphi^c$. In our example (10.318) - (10.319), we evidently have $H = SO(N-1)$. Then $T_a\varphi^c = 0$ for all generators $T_a$ of the Lie algebra $\mathfrak{h}$ of $H$, so that there are

$M = \dim(G) - \dim(H) = \dim(G/H) = \dim(G\cdot\varphi^c)$  (10.326)

linearly independent null eigenvectors of $V''$ (seen as an $N \times N$ matrix). This number equals the dimension of the submanifold of $\mathbb{R}^N$ where $V$ assumes its minimum. In our example we have $M = N-1$, since $\dim(SO(N)) = \frac{1}{2}N(N-1)$. We now perform an affine field redefinition, based on an affine coordinate transformation in $\mathbb{R}^N$ that diagonalizes the matrix $V''$. The original (real) fields were $\varphi = (\varphi_1,\ldots,\varphi_N)$, and the new (real) fields are $(\chi_1,\theta_2,\ldots,\theta_N)$, with

$\chi_1 = \varphi_1 - v$,  (10.327)

as in (10.321), and the Goldstone fields are defined, also in general, by

$\theta_a = \frac{1}{v}\langle T_a\varphi^c,\varphi\rangle = \frac{1}{v}(T_a)_{ij}\varphi^c_j\varphi_i$.  (10.328)

Here $\langle\cdot,\cdot\rangle$ denotes the inner product in $\mathbb{R}^N$, and we have chosen a basis of $\mathfrak{g}$ in which the elements $(T_1,\ldots,T_{\dim(H)})$ form a basis of $\mathfrak{h}$, completed by $M$ further elements $(T_{\dim(H)+1},\ldots,T_{\dim(G)})$ so as to have a basis of $\mathfrak{g}$. The index $a$ in (10.328), then, runs from $\dim(H)+1$ to $\dim(G)$, so that there are $M$ Goldstone fields, cf. (10.326). In our running example, this number was shown to be $M = N-1$, and in view of (10.319), the field $\theta_a = (T_a)_{i1}\varphi_i$ is a linear combination of $\varphi_2$ through $\varphi_N$.
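
For the running example the null directions of $V''$ can be exhibited by direct computation. The following sketch assumes that (10.318) reads $V(\varphi) = -\frac{1}{2}m^2\varphi^2 + \frac{1}{4}\lambda\varphi^4$ with $m^2 > 0$, and that $\varphi^c = (v,0,\ldots,0)$ with $\lambda v^2 = m^2$, as in (10.319) - (10.320).

```latex
% Hessian of the assumed potential V = -\tfrac12 m^2 \varphi^2 + \tfrac14 \lambda \varphi^4 at \varphi^c = (v,0,...,0).
\begin{align*}
\frac{\partial V}{\partial \varphi_i} &= (-m^2 + \lambda \varphi^2)\,\varphi_i, \\
\frac{\partial^2 V}{\partial \varphi_i \partial \varphi_j}
  &= (-m^2 + \lambda \varphi^2)\,\delta_{ij} + 2\lambda\,\varphi_i \varphi_j, \\
V''_{ij}
  &= 2\lambda v^2\, \delta_{i1}\delta_{j1} = 2 m^2\, \delta_{i1}\delta_{j1}
  \qquad (\text{using } \lambda v^2 = m^2), \\
M = \dim SO(N) - \dim SO(N-1)
  &= \tfrac12 N(N-1) - \tfrac12 (N-1)(N-2) = N-1 .
\end{align*}
```

So $V''$ has exactly one nonzero eigenvalue $2m^2$ (along $\varphi_1$) and $N-1$ zero eigenvalues along $\varphi_2,\ldots,\varphi_N$, matching the count $M = N-1$ of Goldstone fields.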

The simplest example is $N = 2$, with potential (10.318) and $m^2 > 0$. With the single generator $T = -i\sigma_2$, we obtain $\theta = \varphi_2$. Since $V'' = \mathrm{diag}(2m^2,0)$, we see that the mass term $-\frac{1}{2}m^2\varphi_1^2$ in (10.318) (with $\varphi^2 = \varphi_1^2 + \varphi_2^2$) changes from the “wrong” sign $-m^2$ to the “right” sign $+2m^2$ in (10.322), whilst $-\frac{1}{2}m^2\varphi_2^2$ in (10.318) disappears, so that the field $\theta$ comes out to be massless. Indeed, this is the point of the introduction of the Goldstone fields: in view of (10.325) and (10.328), the Goldstone fields do not occur in the quadratic term in (10.322) and hence they are massless, in satisfying a field equation of the form $\partial_\mu\partial^\mu\theta_a = \cdots$, where $\cdots$ does not contain any term linear in any field. This proves the classical Goldstone Theorem:

Theorem 10.26. Suppose that a compact Lie group $G \subset SO(N)$ acts on $N$ real scalar fields $\varphi = (\varphi_1,\ldots,\varphi_N)$, leaving the potential $V$ in the Lagrangian (10.317) invariant. If $G$ is spontaneously broken to an unbroken subgroup $H \subset G$ (in the sense that the stability group of some point $\varphi^c$ in the $G$-orbit minimizing $V$ is $H$), then there are at least $\dim(G/H)$ massless fields, i.e., there is a field transformation

$(\varphi_1,\ldots,\varphi_N) \mapsto (\chi_1,\ldots,\chi_{N-M},\theta_1,\ldots,\theta_M)$  $(M = \dim(G) - \dim(H))$,  (10.329)

that is invertible in a neighborhood of $\varphi = \varphi^c$, such that the potential $V(\varphi)$, re-expressed in the fields $\chi$ and $\theta$, has no quadratic terms in $\theta$.

The local invertibility of the field redefinition around 0 is crucial; in our example, where $\chi = \chi_1 = \varphi_1 - v$ and $\theta_a = (T_a)_{i1}\varphi_i$, this may be checked explicitly.

An alternative proof of Theorem 10.26 uses nonlinear Goldstone fields, viz.

$\varphi(x) = e^{\frac{1}{v}\theta_a(x)T_a}(\varphi^c + \chi(x))$,  (10.330)

where the sum over $a$ (implicit in the Einstein summation convention) ranges from 1 to $M$, $v = \|\varphi^c\|$, and the fields $\chi = (\chi_1,\ldots,\chi_{N-M})$ are chosen orthogonal (in $\mathbb{R}^N$) to each $T_a\varphi^c$, $a = 1,\ldots,M$, and hence to the $\theta_a$. Provided that the generators of $SO(N)$ (and hence of $G \subset SO(N)$) have been chosen such that

$\langle T_a\varphi^c, T_b\varphi^c\rangle = v^2\delta_{ab}$,  (10.331)

the fields $\theta_a$ defined by (10.330) coincide with the fields in (10.328) up to quadratic terms in $\chi$ and $\theta$; to see this, expand the exponential and also use the fact that both $\langle T_a\varphi^c,\varphi^c\rangle$ and $\langle T_a\varphi^c,\chi\rangle$ vanish. This transformation is only well defined if $v \neq 0$, i.e., if SSB from $G$ to $H$ occurs, and its existence implies the Goldstone Theorem 10.26, for by (10.330) and $G$-invariance, $V(\varphi)$ is independent of $\theta$.
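
The agreement between the linear and the nonlinear Goldstone fields to first order can be written out explicitly. The sketch below assumes that (10.330) has the exponential form $\varphi = \exp(\theta_bT_b/v)(\varphi^c + \chi)$ and that (10.328) reads $\theta_a = \langle T_a\varphi^c,\varphi\rangle/v$; it uses only the normalization (10.331) and the two orthogonality relations quoted in the text.

```latex
% First-order expansion of the assumed form of (10.330), inserted into the assumed form of (10.328).
\begin{align*}
\varphi &= \varphi^c + \chi + \frac{1}{v}\,\theta_b T_b \varphi^c
           + O(\theta\chi,\theta^2), \\
\frac{1}{v}\langle T_a\varphi^c, \varphi\rangle
  &= \frac{1}{v}\Big( \underbrace{\langle T_a\varphi^c, \varphi^c\rangle}_{=0}
     + \underbrace{\langle T_a\varphi^c, \chi\rangle}_{=0}
     + \frac{1}{v}\,\theta_b \underbrace{\langle T_a\varphi^c, T_b\varphi^c\rangle}_{= v^2\delta_{ab}} \Big)
     + O(\theta\chi,\theta^2) \\
  &= \theta_a + O(\theta\chi,\theta^2).
\end{align*}
```

The first bracket vanishes because $T_a$ is anti-symmetric, the second by the choice of $\chi$, and the third is fixed by (10.331).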

The Goldstone Theorem can be derived in quantum field theory, but in the spirit 
of this chapter we will discuss it rigorously for quantum spin systems. Far from 
considering the most general case, we merely treat the simplest setting. We assume 
that $A$ is a quasi-local C*-algebra given by (8.130), with $H = \mathbb{C}^n$. Furthermore:

1. The group of space translations $\mathbb{Z}^d$ acts on $A$ by automorphisms $\tau_x$, and so does the group $\mathbb{R}$ of time translations by automorphisms $\alpha_t$ commuting with the $\tau_x$ (cf. §9.3); we often write $\alpha_{(x,t)}$ for $\alpha_t \circ \tau_x$ as well as $a(x,t)$ for $\alpha_t \circ \tau_x(a)$.

2. A compact Lie group $G$ acts on $H = \mathbb{C}^n$ through a unitary representation $u$ and hence acts on $A$ by automorphisms $\gamma_g$ as in (10.58) - (10.59), such that

$\gamma_g \circ \alpha_{(x,t)} = \alpha_{(x,t)} \circ \gamma_g$  $((x,t) \in \mathbb{Z}^d \times \mathbb{R},\ g \in G)$.  (10.332)

3. There exists a pure translation-invariant ground state $\omega$.

4. One has SSB in that $\omega \circ \gamma_g \neq \omega$ for all $g \in G_a \subset G$, where

$G_a = \{\exp(sT_a),\ s \in \mathbb{R},\ T_a \in \mathfrak{g}\}$.  (10.333)

5. There is an $n$-tuple $\varphi = (\varphi_1,\ldots,\varphi_n)$ of local operators $\varphi_a \in M_n(\mathbb{C})$ that transforms under $G$ by $\varphi \mapsto u_g\varphi u_g^* = \gamma_g(\varphi)$, and defines an order parameter $\delta_a\varphi$ by

$\delta_a\varphi = \frac{d}{ds}\left(\gamma_{\exp(sT_a)}(\varphi)\right)\big|_{s=0}$,  (10.334)

at least for SSB of $G_a$ (as above) in that, cf. Definition 10.6,

$\omega(\delta_a\varphi) \neq 0$.  (10.335)

6. Writing $j_a^0 = iu'(T_a) \in M_n(\mathbb{C})$, it follows that $\delta_a\varphi = -i[j_a^0,\varphi]$, and hence that

$\delta_a\varphi(x) = -i\lim_{\Lambda\nearrow\mathbb{Z}^d}\sum_{y\in\Lambda}[j_a^0(y),\varphi(x)]$  $(x \in \mathbb{Z}^d)$,  (10.336)

since by (8.132) (i.e., Einstein locality) only the term $y = x$ will contribute. Physicists then wish to define a charge by $Q_a = \sum_{y\in\mathbb{Z}^d}j_a^0(y)$ and write (10.336) as $\delta_a\varphi(x) = -i[Q_a,\varphi(x)]$, but $Q_a$ does not exist precisely in the case of SSB!

Eq. (10.336) motivates the crucial assumption for the Goldstone Theorem, viz.

$\omega(\delta_a\varphi(x,t)) = -i\lim_{\Lambda\nearrow\mathbb{Z}^d}\sum_{y\in\Lambda}\omega([j_a^0(y),\varphi(x,t)])$  $(x \in \mathbb{Z}^d,\ t \in \mathbb{R})$,  (10.337)

which incorporates the condition that the sum over $y$ converge absolutely. Although (10.337) at first sight softens (10.336) in turning an operator equation into a numerical one, in fact (10.337) decisively sharpens (10.336) by involving the time-dependence of $\varphi$, whose propagation speed should be sufficiently small to enable the limit in (10.336) to catch up with the limit in (10.337). As such, eq. (10.337) is satisfied with short-range forces, but the Meissner effect in superconductivity and the closely related Higgs mechanism in gauge theories (both of which circumvent the Goldstone Theorem) are possible precisely because in those cases (10.337) fails (at least in physical gauges, see also §10.10).

7. Finally, we make two assumptions just for convenience, namely

$\varphi_a(x)^* = \varphi_a(x)$;  (10.338)

$\omega(\varphi_a(x)) = 0$.  (10.339)

If these are not the case, one could simply take real and imaginary components of $\varphi_a$ and/or redefine $\varphi_a$ as $\varphi_a' = \varphi_a - \omega(\varphi_a)\cdot 1_A$, so that $\omega(\varphi_a'(x)) = 0$.

The Goldstone Theorem provides information about the joint energy-momentum spectrum of the theory at hand. To define this notion, we exploit the fact that from assumption no. (3) and Corollary 9.12 we obtain a unitary representation $u_\omega$ of the (locally compact) abelian space-time translation group $A = \mathbb{Z}^d \times \mathbb{R}$ on the GNS-representation space $H_\omega$ induced by $\omega$. The SNAG-Theorem C.114 applied to $A$, with dual $\hat{A} = \mathbb{T}^d \times \mathbb{R}$ (cf. Proposition C.108), then yields a projection-valued measure

$e_\omega: \mathcal{B}(\mathbb{R} \times \mathbb{T}^d) \to \mathcal{P}(H_\omega)$,  (10.340)

as a map from the Borel sets in $\mathbb{R} \times \mathbb{T}^d$ to the projection lattice in $B(H_\omega)$, such that

= f rde{E,k)- (10.341) 

JT'* JO 

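$u_\omega(y,t) = \int_{\mathbb{T}^d}\int_0^\infty de_\omega(E,k)\, e^{i(tE - y\cdot k)}$  $(y \in \mathbb{Z}^d,\ t \in \mathbb{R})$.  (10.342)

Here $k = (k_1,\ldots,k_d)$, $y\cdot k = \sum_{i=1}^d y_ik_i$, and we have reduced the integration range over $E$ (which a priori would be $\mathbb{R}$) to $\mathbb{R}^+$. Indeed, by Stone's Theorem we have $u_\omega(t) = \exp(ith_\omega)$, where $\sigma(h_\omega) \subset [0,\infty)$ because $\omega$ is a ground state by assumption, and the support of $e_\omega$ is evidently contained in $\mathbb{T}^d \times \sigma(h_\omega)$ (cf. Definition A.16).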

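Definition 10.27. The joint energy-momentum spectrum $\sigma(h_\omega,p_\omega)$ of a space-time invariant state $\omega$ (i.e., $\omega \circ \alpha_{(x,t)} = \omega$, $(x,t) \in \mathbb{Z}^d \times \mathbb{R}$) is the support of the projection-valued measure $e_\omega$ associated to the GNS-representation $\pi_\omega$, i.e., the smallest closed set $\sigma(h_\omega,p_\omega) \subset \mathbb{T}^d \times \mathbb{R}$ such that $e_\omega((\mathbb{T}^d \times \mathbb{R}) \setminus \sigma(h_\omega,p_\omega)) = 0$.

The notation $\sigma(h_\omega,p_\omega)$ is purely symbolic here, since (as opposed to the continuum case) the group of spatial translations is discrete and hence has no generators $p_\omega$. Since $u_\omega(x,t)\Omega_\omega = \Omega_\omega$, the origin $(0,0)$ certainly lies in $\sigma(h_\omega,p_\omega)$, with

$e_\omega(0,0) = |\Omega_\omega\rangle\langle\Omega_\omega|$,  (10.343)

which by Theorem 9.14 is the unique $\mathbb{Z}^d \times \mathbb{R}$-invariant state in $H_\omega$. Denoting this contribution to $e_\omega$ by $e_\omega^{(0)}$, in many physical theories one has $e_\omega = e_\omega^{(0)} + e_\omega^{(1)} + \cdots$, where $e_\omega^{(1)}$ is supported on the graph of some continuous function $k \mapsto \epsilon_k \geq 0$, i.e.,

$\{(k,\epsilon_k),\ k \in \mathbb{T}^d\} \subset \sigma(h_\omega,p_\omega) \subset \mathbb{T}^d \times \mathbb{R}$.  (10.344)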

The joint energy-momentum spectrum may be studied in part by considering 

fie,p)= E r 

yelA-’-'" 

= f [ {{n(o,7:(o{jai0))de(oiE,k)n(o{(Pa{0))i2(o)S{e-E)5{p-k) 
Jt‘‘ Jo 

- {Qana{(pa{0))dea{E,k)na{fM)^m)5{e+E)d{p + k)), (10.345) 

i.e., the Fourier transform of the two-point function defined by $j_a^0$ and $\varphi$, which is a distribution on the dual group $\mathbb{T}^d \times \mathbb{R}$; for the third equality we used a distributional version of the Fourier inversion formula (C.382). For example, if we replace $e_\omega(E,k)$ by $e_\omega^{(1)}(E,k)$, then, since $e_\omega^{(1)}$ is absolutely continuous with respect to Haar measure $d^dk$ on $\mathbb{T}^d$, we see that $f(\epsilon,p)$ is proportional to $\delta(\epsilon - \epsilon_p)$.

Theorem 10.28. Under assumptions 1-7 (notably (10.337) and SSB of some continuous symmetry), the Hamiltonian $h_\omega$ has continuous spectrum starting at zero and hence has no gap. If there is an excitation spectrum $e_\omega^{(1)}$ as explained above, with

$\int \langle\Omega_\omega, \pi_\omega(j_a^0(0))\, de_\omega^{(1)}(E,k)\, \pi_\omega(\varphi_a(0))\Omega_\omega\rangle \neq 0$,  (10.346)

then the continuous function $k \mapsto \epsilon_k$ defining the spectrum satisfies $\epsilon_0 = 0$.

Proof. Since the sum in (10.337) converges absolutely, the Fourier transform $f(t,p)$ of $y \mapsto \omega([j_a^0(y),\varphi(x,t)])$ in $y$ alone is continuous in $p$, and by (10.337) we have

$i\omega(\delta_a\varphi(x,t)) = f(t,0)$.  (10.347)

By (10.332), the left-hand side is independent of $x$ and $t$, hence the Fourier transform $f(\epsilon,0)$ of the right-hand side in $t$ is proportional to $\delta(\epsilon)$. Since (10.343) does not contribute to $f$ by (10.339), the calculation (10.345) shows that $f(\epsilon,0) = 0$ if $\sigma(h_\omega)$ has a gap. But $f(\epsilon,0) \neq 0$ by (10.335), and so $\sigma(h_\omega)$ has no gap. Similarly, for the final claim note that $f(\epsilon,0) \sim \delta(\epsilon - \epsilon_0)$ as well as $f(\epsilon,0) \sim \delta(\epsilon)$. □
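
The two delta-function statements used in the proof can be made explicit. The following is a sketch under an assumed Fourier convention $f(\epsilon,0) = \int_{\mathbb{R}} dt\, e^{i\epsilon t} f(t,0)$ (the normalization and sign are not fixed by the text); the constants $c$ and $c'$ are labels introduced here.

```latex
% Assumed convention: f(\epsilon,0) = \int dt\, e^{i \epsilon t} f(t,0); c and c' are illustrative labels.
\begin{align*}
f(t,0) &= i\,\omega(\delta_a\varphi) =: c \quad (\text{independent of } t,\ c \neq 0 \text{ by (10.335)}), \\
f(\epsilon,0) &= c \int_{\mathbb{R}} dt\, e^{i\epsilon t} = 2\pi c\, \delta(\epsilon), \\
f(\epsilon,0) &= c'\, \delta(\epsilon - \epsilon_0) \ \text{with } c' \neq 0
  \quad\Longrightarrow\quad \epsilon_0 = 0 .
\end{align*}
```

A distribution that is a nonzero multiple of $\delta(\epsilon)$ has support $\{0\}$, so it cannot vanish near $\epsilon = 0$ (excluding a gap), and it can only coincide with a nonzero multiple of $\delta(\epsilon - \epsilon_0)$ if $\epsilon_0 = 0$.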
10.10 The Higgs mechanism


We proceed to a discussion of SSB in gauge theories, especially with an eye on the 
Higgs Mechanism, which plays a central role in the Standard Model of high-energy 
physics (whose empirical confirmation was more or less finished with the discovery 
of the Higgs boson at CERN, announced on July 4, 2012). 

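We look at the Abelian Higgs Model, given by the Lagrangian

$\mathcal{L} = -\frac{1}{4}F^2 + \frac{1}{2}\langle D_\mu\varphi, D^\mu\varphi\rangle - V(\varphi)$,  (10.348)

where $\varphi = (\varphi_1,\varphi_2)$ is a scalar doublet, the usual electromagnetic field strength is

$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$,  (10.349)

in terms of which $F^2 = F_{\mu\nu}F^{\mu\nu}$, and the covariant derivative is

$D_\mu = \partial_\mu - eA_\mu\cdot T = \partial_\mu\cdot 1_2 + ieA_\mu\cdot\sigma_2$.  (10.350)

Here $e$ is some coupling constant, identified with the unit of electrical charge. We still assume that $V$ only depends on $\|\varphi\|^2 = \langle\varphi,\varphi\rangle$ and hence is $SO(2)$-invariant.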

The novel situation compared to (10.317) and the like is that, whereas (10.317) is invariant under global $SO(2)$ transformations, the Lagrangian (10.348) is invariant under local $SO(2)$ gauge transformations that depend on $x$, namely

$\varphi(x) \mapsto e^{\alpha(x)\cdot T}\varphi(x) = \begin{pmatrix}\cos\alpha(x) & -\sin\alpha(x)\\ \sin\alpha(x) & \cos\alpha(x)\end{pmatrix}\begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix}$;  (10.351)

$A_\mu(x) \mapsto A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x)$.  (10.352)

We say that the local gauge group $\mathcal{G} = C^\infty(\mathbb{R}^4, U(1))$ acts on the space of fields $(A,\varphi)$ by (10.351) - (10.352). Now suppose $V$ has a minimum at some constant value $\varphi^c \neq 0$. In that case, any field configuration

$\varphi(x) = \exp(\alpha(x)\cdot T)\varphi^c$;  (10.353)

$A_\mu(x) = (1/e)\partial_\mu\alpha(x)$  $(\alpha \in \mathcal{G})$,  (10.354)

minimizes the action. Hence the possible “vacua” of the model comprise the (infinite-dimensional) orbit $\mathcal{V}$ of the gauge group through $(A = 0, \varphi = \varphi^c)$. Note that $D_\mu\varphi = 0$ for $(A,\varphi) \in \mathcal{V}$, i.e., $\varphi$ is covariantly constant along the vacuum orbit (whereas for global symmetries it is constant full stop). Relative to the (arbitrary) choice $(0,\varphi^c) \in \mathcal{V}$, we then introduce real fields $\chi$ and $\theta$, called the Higgs field and the would-be Goldstone boson, respectively, by (10.330), which now simply reads

$\begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix} = e^{\frac{1}{v}\theta(x)\cdot T}\begin{pmatrix}v + \chi(x)\\ 0\end{pmatrix}$.  (10.355)

After this redefinition of the scalar fields, the Lagrangian (10.348) becomes

$\mathcal{L} = -\frac{1}{4}F_B^2 + \frac{1}{2}\partial_\mu\chi\partial^\mu\chi + \frac{1}{2}e^2(v + \chi)^2B_\mu B^\mu - V(v + \chi, 0)$,  (10.356)

where $B_\mu = A_\mu - (1/ev)\partial_\mu\theta$, and $F_B^2 = F_{\mu\nu}F^{\mu\nu}$ for $F_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$. This describes a vector boson $B$ with mass term $\frac{1}{2}m_B^2B_\mu B^\mu$, with $m_B^2 = e^2v^2 > 0$ (as opposed to the massless vector field $A$), and a scalar field $\chi$ with mass term $\frac{1}{2}m_\chi^2\chi^2$, with $m_\chi^2 = (\partial^2V/\partial\varphi_1^2)|_{(v,0)} > 0$ (since $V$ supposedly has a minimum at $\varphi^c = (v,0)$).

This is the Higgs mechanism: the gauge field becomes massive, whilst the massless (“would-be”) Goldstone boson disappears from the theory: it is (allegedly) “eaten” by the gauge field. Thus the scalar degree of freedom $\theta$ that seems lost is recovered as the longitudinal component of the massive vector field (which for a gauge field would have been an unphysical gauge degree of freedom, see below).
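
The passage from (10.355) to (10.356) rests on a short computation of the covariant derivative. The following sketch assumes $T = -i\sigma_2$, i.e. the real matrix with rows $(0,-1)$ and $(1,0)$, the covariant derivative $D_\mu = \partial_\mu - eA_\mu T$, and $B_\mu = A_\mu - (1/ev)\partial_\mu\theta$; the shorthand $w$ is introduced here only for the computation.

```latex
% Assumed: T = -i\sigma_2 (real 2x2 rotation generator), D_\mu = \partial_\mu - e A_\mu T,
% B_\mu = A_\mu - (1/ev)\partial_\mu\theta; w is a shorthand used only here.
\begin{align*}
w &:= \begin{pmatrix} v + \chi \\ 0 \end{pmatrix}, \qquad
T w = \begin{pmatrix} 0 \\ v + \chi \end{pmatrix}, \qquad
\varphi = e^{\theta T / v}\, w, \\
D_\mu \varphi
  &= e^{\theta T / v}\Big( \partial_\mu w
     + \big(\tfrac{1}{v}\partial_\mu\theta - e A_\mu\big)\, T w \Big)
   = e^{\theta T / v}
     \begin{pmatrix} \partial_\mu \chi \\ - e\,(v+\chi)\, B_\mu \end{pmatrix}, \\
\tfrac12 \langle D_\mu\varphi, D^\mu\varphi \rangle
  &= \tfrac12\, \partial_\mu\chi\, \partial^\mu\chi
   + \tfrac12\, e^2 (v+\chi)^2\, B_\mu B^\mu .
\end{align*}
```

The last step uses that $e^{\theta T/v}$ is a rotation and hence preserves the inner product. Since $A_\mu$ and $B_\mu$ differ by a gradient, they have the same field strength, and the $v^2$ part of the second term is the mass term $\frac{1}{2}e^2v^2B_\mu B^\mu$; the field $\theta$ no longer appears.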

In the description just given, the Higgs mechanism in classical field theory is seen as a consequence of SSB. Remarkably, there is an alternative account of the Higgs mechanism, according to which it has nothing to do with SSB! Namely, we now perform a field redefinition analogous to (10.355) etc. straight away, viz.

$\begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix} = e^{\theta(x)\cdot T}\begin{pmatrix}\rho(x)\\ 0\end{pmatrix}$;  (10.357)

$A_\mu = B_\mu + (1/e)\partial_\mu\theta$.  (10.358)

This transformation is defined and invertible in a neighbourhood of any point $(\rho_0,\theta_0,B_0)$, where $\rho_0 > 0$, $\theta_0 \in (-\pi,\pi)$, and $B_0$ is arbitrary. Of these new fields, $\rho$ and $B$ are gauge-invariant: for the gauge transformation (10.351) becomes

$\theta(x) \mapsto \theta(x) + \alpha(x)$;  (10.359)

$\rho(x) \mapsto \rho(x)$,  (10.360)

and in view of (10.352), $B$ does not transform at all. The Lagrangian becomes

$\mathcal{L} = -\frac{1}{4}F_B^2 + \frac{1}{2}\partial_\mu\rho\partial^\mu\rho + \frac{1}{2}e^2\rho^2B_\mu B^\mu - V(\rho)$,  (10.361)

with $V(\rho) = V(\rho,0)$. This is a Lagrangian without any internal symmetries at all (not even $\mathbb{Z}_2$, since $\rho > 0$), but of course one can still look for classical vacua that minimize the energy and hence the potential $V(\rho)$. If $\rho = 0$ is the absolute minimum, then the above field redefinition is a fortiori invalidated, but if $V'(v) = 0$ for some $v > 0$, we proceed as before, introducing a Higgs field $\chi(x) = \rho(x) - v$, and recovering the Lagrangian (10.356). This once again leads to the Higgs mechanism.

This can be generalized to the nonabelian case; since it suffices to explain the idea, we just discuss the $SU(2)$ case. In (10.348), the scalar field $\varphi = (\varphi_1,\varphi_2)$ is now complex, forming an $SU(2)$ doublet, the brackets $\langle\cdot,\cdot\rangle$ now denote the inner product in $\mathbb{C}^2$, the nonabelian gauge field is $A = A^a\sigma_a$ (where the Pauli matrices $\sigma_a$, $a = 1,2,3$, form a self-adjoint basis of the Lie algebra of $SU(2)$), with associated field strength $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + g[A_\mu,A_\nu]$ and covariant derivative $D_\mu = \partial_\mu + igA_\mu$. With $F^2 = F^a_{\mu\nu}F_a^{\mu\nu}$, the Lagrangian (10.348) is invariant under the transformations

(p{x) ^ (10.362) 

A^{x) ^ (10.363) 

The definition of the gauge-invariant fields $B$ and $\rho$ à la (10.357) - (10.358) is now

$\begin{pmatrix}\varphi_1(x)\\ \varphi_2(x)\end{pmatrix} = e^{i\theta_a(x)\sigma_a}\begin{pmatrix}\rho(x)\\ 0\end{pmatrix}$;  (10.364)

$A_\mu(x) =$  (10.365)

which leads, mutatis mutandis, to the very same Lagrangian (10.361). 

As a compromise between these two derivations of the Higgs mechanism, it is also possible to fix the gauge by picking the representative $(\varphi,A)$ in each $\mathcal{G}$-orbit for which $\varphi_2(x) = 0$ and $\varphi_1(x) > 0$; note that this so-called unitary gauge is ill-defined if $\varphi_1(x) = 0$. Calling this unique representative $(\rho,B)$, we are again led to (10.361).

Gauge field theories are constrained systems, in which the apparent degrees of freedom in the Lagrangian are not the physical ones. For free electromagnetism, the Lagrangian is $\mathcal{L}(A) = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$, with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. In terms of the gauge-invariant fields $E_i = F_{i0} = \partial_iA_0 - \partial_0A_i$ and $B = \nabla \times A$, Maxwell's equations

$\nabla\cdot E = 0$;  (10.366)

$\partial E/\partial t = \nabla \times B$;  (10.367)

$\partial B/\partial t = -\nabla \times E$;  (10.368)

$\nabla\cdot B = 0$,  (10.369)

then arise as follows: eqs. (10.366) and (10.367) correspond to the Euler-Lagrange equation for $A_0$ and $A_i$, respectively, whereas (10.368) and (10.369) immediately follow from the definitions of $B$ and $E$ in terms of $A$. The Maxwell equations are in Hamiltonian form, with canonical momenta $\Pi^\mu = \partial\mathcal{L}/\partial\dot{A}_\mu$; this yields $\Pi_i = -E_i$, as well as the primary constraint $\Pi_0 = 0$. Nonetheless, the canonical Hamiltonian

$h = \int d^3x\,\left(\Pi^\mu(x)\dot{A}_\mu(x) - \mathcal{L}(x)\right) = \int d^3x\,\left(\frac{1}{2}E^2(x) + \frac{1}{2}B^2(x) - A_0(x)\nabla\cdot E(x)\right)$

is well defined. In the Hamiltonian formalism, Gauss' Law resurfaces as the secondary constraint stating that the primary constraint be preserved in time, viz.

$\dot{\Pi}_0(x) = -\frac{\delta h}{\delta A_0(x)} = \nabla\cdot E(x) = 0$.  (10.370)

Since

$\frac{d}{dt}\nabla\cdot E(x) = -\partial_i(\delta h/\delta A_i(x)) = -\partial_i(\Delta A_i - \partial_i\nabla\cdot A) = 0$,  (10.371)

there are no “tertiary” constraints. Thus we have canonical phase space variables $(E,A)$ and $(\Pi_0,A_0)$, subject to (10.366) and to $\Pi_0(x) = 0$ for each $x \in \mathbb{R}^3$, i.e.,

$\Pi_0(\lambda_0) = \int d^3x\,\Pi_0(x)\lambda_0(x) = 0$;  (10.372)

$\Pi(\lambda) = \int d^3x\,\nabla\cdot E(x)\lambda(x) = 0$,  (10.373)

for all (reasonable) functions $\lambda_0$ and $\lambda$ on $\mathbb{R}^3$. The constraints (10.372) - (10.373) are first class in the sense of Dirac, which means that their Poisson brackets are equal to existing constraints (or zero). In the Hamiltonian formalism, the role of the space-time dependent gauge transformations of the Lagrangian theory is played by the canonical transformations generated by the first class constraints, i.e.,

$\delta_{\lambda_0}A_0(x) = \{\Pi_0(\lambda_0),A_0(x)\} = \lambda_0(x)$;  (10.374)

$\delta_{\lambda_0}A_i(x) = \delta_{\lambda_0}E_i(x) = 0$;  (10.375)

$\delta_\lambda A(x) = \nabla\lambda(x)$;  (10.376)

$\delta_\lambda E(x) = 0$;  (10.377)

$\delta_\lambda A_0(x) = 0$.  (10.378)
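
The gauge transformations generated by the constraints can be checked with elementary Poisson brackets. The sketch below assumes the bracket convention $\{\Pi_\mu(y),A_\nu(x)\} = \delta_{\mu\nu}\delta(x-y)$, chosen so as to reproduce the sign in (10.374), ignores the distinction between upper and lower indices, uses $\Pi_i = -E_i$, and assumes that $\lambda$ decays fast enough for boundary terms to vanish.

```latex
% Assumed bracket convention: \{\Pi_\mu(y), A_\nu(x)\} = \delta_{\mu\nu}\,\delta(x-y), with \Pi_i = -E_i.
\begin{align*}
\Pi(\lambda) &= \int d^3y\, \partial_j E_j(y)\, \lambda(y)
   = -\int d^3y\, E_j(y)\, \partial_j\lambda(y)
   = \int d^3y\, \Pi_j(y)\, \partial_j\lambda(y), \\
\delta_\lambda A_i(x) &= \{\Pi(\lambda), A_i(x)\}
   = \int d^3y\, \partial_j\lambda(y)\, \{\Pi_j(y), A_i(x)\}
   = \partial_i \lambda(x), \\
\delta_\lambda E_i(x) &= \{\Pi(\lambda), E_i(x)\} = 0, \qquad
\{\Pi_0(\lambda_0), \Pi(\lambda)\} = 0 .
\end{align*}
```

The second line reproduces (10.376), and the third line reflects that the momenta commute among themselves, so that the two constraints are first class.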


The holy grail of the Hamiltonian formalism is to find variables that are both gauge invariant and unconstrained. In our case, $A_\mu = (A_0,A)$ are unconstrained but gauge variant, whilst $\Pi_\mu = (\Pi_0,-E)$ are gauge invariant but constrained! Now write some vector field $V$ as $V = V^L + V^T$, where $V^L = \Delta^{-1}\nabla(\nabla\cdot V)$ is the longitudinal component, so that $V_i^T = (\delta_{ij} - \Delta^{-1}\partial_i\partial_j)V_j$ is the transverse part. Then the physical variables of free electromagnetism are $A^T$ and $E^T$. The physical Hamiltonian

$h = \frac{1}{2}\int d^3x\,(E^T\cdot E^T - A^T\cdot\Delta A^T)$,  (10.379)

then, is well defined on the physical (or reduced) phase space, which is the subset of all $(A_\mu,\Pi_\mu)$ where the constraints (10.373) hold, modulo gauge equivalence.
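
The form of the physical Hamiltonian can be verified from the decomposition into longitudinal and transverse parts. The sketch below assumes $V^L = \Delta^{-1}\nabla(\nabla\cdot V)$ and $V^T = V - V^L$, fields decaying at infinity so that integration by parts produces no boundary terms, and the Gauss constraint $\nabla\cdot E = 0$ of the free theory.

```latex
% Assumed: V^L = \Delta^{-1}\nabla(\nabla\cdot V), V^T = V - V^L, no boundary terms, \nabla\cdot E = 0.
\begin{align*}
\partial_i V^T_i &= \partial_j V_j - \Delta^{-1}\Delta\, \partial_j V_j = 0,
  \qquad \nabla \times V^L = 0, \\
E^L &= \Delta^{-1}\nabla(\nabla\cdot E) = 0
  \quad\Longrightarrow\quad E = E^T, \\
\int d^3x\, B^2 &= \int d^3x\, (\nabla\times A^T)\cdot(\nabla\times A^T)
  = \int d^3x\, A^T\cdot\big(\nabla(\nabla\cdot A^T) - \Delta A^T\big)
  = -\int d^3x\, A^T\cdot\Delta A^T, \\
\tfrac12 \int d^3x\, (E^2 + B^2)
  &= \tfrac12 \int d^3x\, \big(E^T\cdot E^T - A^T\cdot\Delta A^T\big).
\end{align*}
```

The longitudinal part of $A$ drops out of $B = \nabla\times A$ and is pure gauge by (10.376), while the longitudinal part of $E$ is removed by the constraint, which leaves exactly (10.379).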

After this preparation, we now revisit the abelian Higgs model as a constrained Hamiltonian system. It is convenient to combine the two real scalar fields $\varphi_1$ and $\varphi_2$ into a single complex scalar field $\varphi = (\varphi_1 + i\varphi_2)/\sqrt{2}$, and treat $\varphi$ and its complex conjugate $\overline{\varphi}$ as independent variables. The Lagrangian (10.348) then becomes

$\mathcal{L} = -\frac{1}{4}F^2 + \overline{D_\mu\varphi}D^\mu\varphi - V(\varphi,\overline{\varphi})$,  (10.380)

with $D_\mu\varphi = (\partial_\mu - ieA_\mu)\varphi$, etc. The conjugate momenta $\Pi^\mu$ to $A_\mu$ are the same as for free electromagnetism, i.e., $\Pi_0 = 0$ and $\Pi_i = -E_i$, and for $\varphi$ we obtain

$\pi = \partial\mathcal{L}/\partial\dot{\varphi} = \overline{D_0\varphi}$;  (10.381)

$\overline{\pi} = \partial\mathcal{L}/\partial\dot{\overline{\varphi}} = D_0\varphi$.  (10.382)

The associated Hamiltonian $h$ is equal to

$\int d^3x\left(\frac{1}{2}E^2 + \frac{1}{2}B^2 - A_0(\nabla\cdot E - j_0) + \overline{\pi}\pi + \overline{D_i\varphi}D_i\varphi + V(\varphi,\overline{\varphi})\right)$,  (10.383)

where $j_0 = ie(\pi\varphi - \overline{\pi}\,\overline{\varphi})$ is the zero'th component of the Noether current. Hence the primary constraint remains $\Pi_0 = 0$, but the secondary constraint picks up an additional term and becomes $\nabla\cdot E = j_0$ (which remains Gauss' law!). The physical (i.e., gauge invariant and unconstrained) variables can be computed as


(Pa 

Ka 


= Ip^ = ; 


(10.384) 

(10.385) 


plus the same transverse fields $A^T$ and $E^T$ as in free electromagnetism. In terms of the transverse covariant derivative $D_i^T = \partial_i - ieA_i^T$, the physical Hamiltonian $h$ is

$\int d^3x\left(\frac{1}{2}\left(E^T\cdot E^T - A^T\cdot\Delta A^T - j_0\Delta^{-1}j_0\right) + \overline{\pi}_A\pi_A + \overline{D_i^T\varphi_A}D_i^T\varphi_A + V(\varphi_A,\overline{\varphi}_A)\right)$.  (10.386)

The third term in (10.386) is the Coulomb energy, in which the charge density

$ie(\pi_A\varphi_A - \overline{\pi}_A\overline{\varphi}_A)$  (10.387)

is the same as $j_0$ (since the latter is gauge invariant). Remarkably, the physical field variables carry a residual global $U(1)$-symmetry, viz.

$\varphi_A \mapsto \exp(i\alpha)\varphi_A$;  (10.388)

$\pi_A \mapsto \exp(-i\alpha)\pi_A$;  (10.389)

$\overline{\varphi}_A \mapsto \exp(-i\alpha)\overline{\varphi}_A$;  (10.390)

$\overline{\pi}_A \mapsto \exp(i\alpha)\overline{\pi}_A$,  (10.391)

and no change for $A^T$ and $E^T$, under which the Hamiltonian (10.386) is invariant. If $V$ has a minimum at $|\varphi_A| = v$, we recover the Higgs mechanism: redefining

$\varphi_A = \exp(i\theta/v)(v + \chi)$,  (10.392)

and its complex conjugate, and reintroducing the longitudinal components

$A_i^L = -(1/ev)\partial_i\theta$;  $E_i^L = -ev\Delta^{-1}\partial_i\pi_\theta$,  (10.393)

of the gauge field and its conjugate momentum, the Hamiltonian (10.386) becomes

1 Jd^x ^E^ + + diXdiX + ^^2^^ + e^v^A^ + l^(v + , (10.394) 

where $A = A^L + A^T$ and $E = E^L + E^T$. This describes a massive vector field, and the would-be Goldstone boson $\theta$ has disappeared, as befits the Higgs mechanism!

It is fair to say that the Higgs mechanism in quantum field theory (and more generally, the notion of SSB in gauge theories) is poorly understood. Indeed, the entire quantization of gauge theories is not well understood, except at the perturbative level or on a lattice. The problems already come out in the abelian case with $d = 3$. The main culprit is Gauss' Law $\nabla\cdot E = j_0$. One would naively expect this constraint to remain valid in quantum field theory as an operator equation, and this is indeed the case in so-called physical gauges like the Coulomb gauge (i.e. $\partial_iA_i = 0$).

If we now look at condition (10.337) in §10.9, which for $G = U(1)$ and for example $\delta\varphi_1 = \varphi_2$ and $\delta\varphi_2 = -\varphi_1$ for a charged field $\varphi = (\varphi_1 + i\varphi_2)/\sqrt{2}$, or $\delta\varphi = i\varphi$, reads

$\lim_{\Lambda\nearrow\mathbb{R}^3}\int_\Lambda d^3y\,\omega([j_0(y,0),\varphi_a(x,t)]) = -i\omega(\delta\varphi_a(x,t))$,  (10.395)

then it is clear that (10.395) can only hold if charged fields are nonlocal. For by Gauss' Law the commutator $[j_0(y,0),\varphi_a(x,t)]$ equals $[\nabla\cdot E(y,0),\varphi_a(x,t)]$, and by Gauss'(!) Theorem in vector calculus, all contributions to the left-hand side of (10.395) come from terms $[E_i(y,0),\varphi_a(x,t)]$, with $y \in \partial\Lambda$ (i.e., the boundary of $\Lambda$). These must remain nonzero if $\Lambda \nearrow \mathbb{R}^3$, at least if (10.395) holds. On the other hand, such nonlocality must be enforced by massless fields, which idea leads to one of the very few rigorous results about the Higgs mechanism (in the continuum):

Theorem 10.29. In the Coulomb gauge the following conditions are equivalent:

• The electromagnetic field $A$ is massless;

• Eq. (10.395) holds for any field $\varphi_a$;

• The charge operator $Q = \lim_{\Lambda\nearrow\mathbb{R}^3}\int_\Lambda d^3y\, j_0(y,0)$ exists (on some suitable domain in $H_\omega$ containing $\Omega_\omega$) and satisfies $Q\Omega_\omega = 0$.

Hence (contrapositively), SSB of $U(1)$ by the state $\omega$ is only possible if $A$ is massive. In that case, the Fourier transform of the two-point function $\langle 0|\varphi_a(x,x_0)j_0(y,y_0)|0\rangle$ (cf. the proof of the Goldstone Theorem 10.28 in §10.9) has a pole at the mass of $A$.

This theorem indeed yields the Higgs mechanism for say the abelian Higgs model in a specific physical gauge: note that the idea that the would-be Goldstone boson is eaten by the gauge field is already suggested by Gauss' Law, through which (minus) the canonical momentum $E$ to $A$ acquires $j_0$ as its longitudinal component; that is, the very same field that creates the Goldstone boson from the ground state.
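
The surface-term argument that precedes Theorem 10.29 can be written out in one line. The sketch below assumes that Gauss' Law $\nabla\cdot E = j_0$ holds as an operator equation (as in the Coulomb gauge) and that $\Lambda \subset \mathbb{R}^3$ is a bounded region with smooth boundary $\partial\Lambda$ and outward surface element $dS_i$.

```latex
% Assumed: operator Gauss law \nabla\cdot E = j_0; \Lambda bounded with smooth boundary, surface element dS_i.
\begin{align*}
\int_\Lambda d^3y\, [j_0(y,0), \varphi_a(x,t)]
  &= \int_\Lambda d^3y\, \frac{\partial}{\partial y_i} [E_i(y,0), \varphi_a(x,t)]
   = \oint_{\partial\Lambda} dS_i(y)\, [E_i(y,0), \varphi_a(x,t)] .
\end{align*}
```

If $\varphi_a$ were local, the commutator on the right would vanish once $\partial\Lambda$ lies far enough from $(x,t)$, so the left-hand side of (10.395) would tend to zero and could not equal a nonzero $-i\omega(\delta\varphi_a(x,t))$. This is why SSB in a physical gauge forces the charged fields to be nonlocal.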

In covariant gauges, all fields remain local, but (10.395) is rescued by the gauge-fixing term added to the Lagrangian. For example, adding $\mathcal{L}_{gf} = -(1/2\xi)(\partial_\mu A^\mu)^2$ to (10.348) leads to an equation of motion $\partial^\mu F_{\mu\nu} = j_\nu - \partial_\nu\partial_\mu A^\mu$, so that (discarding all surface terms by locality), one obtains

$-i\omega(\delta\varphi_a(x,t)) = \int d^3y\,\omega([\partial_0A_0(y,0),\varphi_a(x,t)])$.  (10.396)

In the proof of the Goldstone Theorem, the massless Goldstone bosons do emerge, but they turn out to lie in some “unphysical subspace” of $H_\omega$ (which, for local gauges, is not a Hilbert space but has zero- and negative norm states).
Notes


In a philosophical context, the notion of emergence is usually traced to J.S. Mill (1843), who drew attention to ‘a distinction so radical, and of so much importance, as to require a chapter to itself’, namely the one between what Mill calls the principle of the ‘Composition of Causes’, according to which the joint effect of several causes is identical with the sum of their separate effects, and the negation of this principle. For example, in the context of his overall materialism, Mill believed that although all ‘organised bodies’ are composed of material parts,

‘the phenomena of life, which result from the juxtaposition of those parts in a certain manner, bear no analogy to any of the effects which would be produced by the action of the component substances considered as mere physical agents. To whatever degree we might imagine our knowledge of the properties of the several ingredients of a living body to be extended and perfected, it is certain that no mere summing up of the separate actions of those elements will ever amount to the action of the living body itself.’

Mill (1952 [1843], p. 243)

Mill launched what is now called British Emergentism (Stephan, 1992; McLaughlin, 2008; O’Connor & Wong, 2012), a school of thought which seems to have ended with C.D. Broad, who has our sympathy over Mill because of the doubt he expresses in our quotation in the preamble. Among the British Emergentists, the most modern views seem to have been those of S. Alexander, who, as paraphrased in O’Connor & Wong (2012), was committed to a view of emergence as

‘the appearance of novel qualities and associated, high-level causal patterns which cannot be 
directly expressed in terms of the more fundamental entities and principles. But these pat¬ 
terns do not supplement, much less supersede, the fundamental interactions. Rather, they 
are macroscopic patterns running through those very microscopic interactions. Emergent 
qualities are something truly new (...), but the world’s fundamental dynamics remain un¬ 
changed.’ 

Alexander’s idea that emergent qualities ‘admit no explanation’ and had ‘to be accepted
with the “natural piety” of the investigator’ foreshadowed the later notion
of explanatory emergence. Indeed, philosophers distinguish between ontological
and epistemological reduction or emergence, but ontological emergence seems a
relic from the days of vitalism and other immature understandings of physics and
(bio)chemistry (including the formation of chemical compounds, which Broad and
some of his contemporaries still saw as an example of emergence in the strongest
possible sense, i.e., falling outside the scope of the laws of physics). Recent literature,
including the present chapter, is concerned with epistemological emergence,
of which explanatory emergence is a branch. For example, Hempel wrote:

‘The concept of emergence has been used to characterize certain phenomena as ‘novel’, and 
this not merely in the psychological sense of being unexpected, but in the theoretical sense 
of being unexplainable, or unpredictable, on the basis of information concerning the spatial 
parts or other constituents of the systems in which the phenomena occur, and which in this 
context are often referred to as “wholes”.’ (Hempel, 1965, p. 62) 

See also Batterman (2002), Bedau & Humphreys (2008), Norton (2012), Silberstein
(2002), Wayne & Arciszewski (2009), and many other surveys of emergence.

§10.1. Spontaneous symmetry breaking: The double well

The facts we use about the double-well Hamiltonian may be found in Garg (2000) 
or Landau & Lifshitz (1977) at a heuristic level (but with correct conclusions), or, 
rigorously, in Reed & Simon (1978), Simon (1985), Helffer (1988), and Hislop & 
Sigal (1996). Theorem 10.2 is Theorem XIII.47 in Reed & Simon (1978). 

§10.2. Spontaneous symmetry breaking: The flea 

The flea perturbation and its effect on the ground state were first described in 
Jona-Lasinio, Martinelli, & Scoppola (1981a,b), who used methods from stochastic
mechanics. See also Claverie & Jona-Lasinio (1986). Using more conventional
methods, their results were reconfirmed and analyzed further by e.g. Combes, Duclos,
& Seiler (1983), Graffi, Grecchi, & Jona-Lasinio (1984), Helffer & Sjöstrand
(1985), Simon (1985), Helffer (1988), and Cesi (1989). The “Flea on the Elephant” 
terminology used by Simon (1985) motivated the title of Landsman & Reuvers 
(2013), who, as will be explained in the next chapter, identified the proper host 
animal as a cat. All pictures in this section are taken from the latter paper (and 
were prepared by the second author). For the Eyring-Kramers formula see Berglund 
(2011) for mathematicians or Hänggi, Talkner, & Borkovec (1990) for physicists.

§10.3. Spontaneous symmetry breaking in quantum spin systems 

The translation-non-invariant ground states mentioned after Proposition 10.5 are 
discussed e.g. in Example 6.2.56 in Bratteli & Robinson (1997). See also Liu & 
Emch (2005), which was an important source for this section, and Ruetsche (2011)
for a discussion of the definition of SSB through non-implementability. For order 
parameters see e.g. Sewell (2002), §3.3. A proof of Proposition 10.8 may be found 
in Bratteli & Robinson (1997), Proposition 6.2.15. 

§10.4. Spontaneous symmetry breaking for short-range forces 

The idea of SSB goes back to Heisenberg (1928). The C*-algebraic approach in
quantum spin systems with short-range forces is reviewed in Bratteli & Robinson
(1997); see also Nachtergaele (2007). Theorem 10.10 is due to Araki (1974);
see also Simon (1993), Theorem IV.5.6, and Bratteli & Robinson (1997), Theorem
6.2.18. In Definition 10.9, Araki required $\Omega_\omega$ to be separating for $\pi_\omega(A)''$ instead of
$\omega$ to be $\alpha_t$-invariant, but in the presence of (10.53) and hence (10.53) these conditions
are equivalent. The fact that (for short-range forces) global Gibbs states defined
by (10.43) satisfy the KMS condition follows from Theorem 10.10, but this was the 
starting point of Haag, Hugenholtz, & Winnink (1967); see Winnink (1972). 

Uniqueness of KMS states for one-dimensional quantum spin systems with short- 
range forces at any positive temperature (which also holds for the classical case, e.g. 
the one-dimensional Ising model) has been proved by Araki (1975). See also Mattis 
(1965) and Altland & Simons (2010) for some of the underlying physical intuition. 

§10.5. Ground state(s) of the quantum Ising chain 

Theorem 10.11.1 was first established in Pfeuty (1970) by explicit calculation, 
based on Lieb, Schultz, & Mattis (1961). For more information on the quantum Ising 
model (also in higher dimension) see e.g. Karevski (2006), Sachdev (2011), Suzuki 
et al (2013), and Dutta et al (2015). Uniqueness of the ground state of the quantum 
Ising model with $B \neq 0$ holds in any dimension d, as first shown by Campanino,
Klein & Perez (1991) on the basis of Perron-Frobenius type arguments similar to
those for Schrödinger operators. The singular case B = 0 leads to a violation of the
strict positivity conditions necessary to apply the Perron-Frobenius Theorem, and
this case indeed features a degenerate ground state even when $N < \infty$.

The overall picture of SSB described in this section arose from the work of Horsch 
& von der Linden (1988), Kaplan, Horsch, & von der Linden (1989), Kaplan, von 
der Linden, & Horsch (1990), and especially Koma & Tasaki (1993, 1994). See also 
van Wezel (2007, 2008), van Wezel & van den Brink (2007), and Fraser (2016). 

The analogy between the quantum Ising chain and the double-well potential may 
not be surprising physically, since the latter was originally derived from the former: 
in potassium dihydrogen phosphate, i.e. KH₂PO₄, each proton of the hydrogen bond
would reside in one of the two minima of an effective double-well potential originating
in the oxygen atoms, if it were not for tunneling, parametrized by the field B,
which at small values yields a symmetric ground state (De Gennes, 1963). 

§10.6. Exact solution of the quantum Ising chain: $N < \infty$

The general set-up to this solution is due to Lieb, Schultz, & Mattis (1961), and 
was adapted to the quantum Ising chain by Pfeuty (1970), with further details by Karevski
(2006). The complex solution $q_0$ was already noted by Lieb et al. The energy splitting
in higher dimensions does not seem to be known, but Koma & Tasaki (1994,
eq. (1.5)) expect similar behaviour as in $d = 1$.

§10.7. Exact solution of the quantum Ising chain: $N = \infty$

The solution described in this section is due to Araki & Matsui (1985), where 
further details may be found; this is a highlight of modern mathematical physics! 
Theorem 10.20 is due to Araki (1987), although such results have a long history 
going back to Shale & Stinespring (1964, 1965). For a very clear exposition see 
Ruijsenaars (1987). See also Evans & Kawahigashi (1998), Chapter 6.

The reason the one-sided chain $\Lambda = \mathbb{N}$ is problematic is that although the bosonic
algebra $\bigotimes_{j \in \mathbb{N}} M_2(\mathbb{C})$ and its fermionic counterpart $\mathrm{CAR}(\ell^2(\mathbb{N}))$ are well defined, and
are isomorphic through the Jordan-Wigner transformation (10.102) - (10.103), the
limiting dynamics has no simple form on either $A$ or $F$, because the Fourier transform
of $\ell^2(\mathbb{N})$ is the Hardy space $H^2(-\pi,\pi)$ of $L^2$-functions with positive Fourier
coefficients, instead of the usual $L^2(-\pi,\pi)$. Unlike on $L^2$, the energies $\epsilon_q$ of the
fermionic quasiparticles do not define a multiplication operator on $H^2$.

§10.8. Spontaneous symmetry breaking in mean-field theories 

The Poisson structure on $S(B)$ was introduced by Bóna (1988) and more generally
by Duffield & Werner (1992a); see also Bóna (2000). Theorem 10.22 and
Corollary 10.23 are due to Duffield & Werner (1992a). The symplectic leaves of the
given Poisson structure on $S(B)$ (for which notion see e.g. Marsden & Ratiu (1994)
or Landsman (1998a)) were determined by Duffield & Werner (1992a): Two states
$\rho$ and $\sigma$ lie in the same symplectic leaf of $S(B)$ iff $\rho(a) = \sigma(uau^*)$ for some unitary
$u \in B$. If $\rho$ and $\sigma$ are pure, this is the case iff the GNS-representations $\pi_\rho(B)$
and $\pi_\sigma(B)$ are unitarily equivalent, cf. Thm. 10.2.6 in Kadison & Ringrose (1986).
In general the implication holds only in one direction: if p and a lie in the same 
leaf, then they have unitarily equivalent GNS-representations. 

Our survey of equilibrium states of homogeneous mean-field models is based on
Fannes, Spohn, & Verbeure (1980) and Bóna (1989). For rigorous results on the
Curie-Weiss model see Chayes et al (2008) and Ioffe & Levit (2013). Numerical 
evidence for the restoration of Butterfield’s Principle may be found in Botet, Julien 
& Pfeuty (1982) and Botet & Julien (1982), which are up to N ~ 150, and Vidal et 
al (2004), which reaches N = 1000. Note that experimental samples have N < 10. 

In the context of the BCS model of superconductivity (in the strong coupling
limit), the Hamiltonian hg in (10.287) or hto in (10.294) is called the Bogoliubov-Haag
Hamiltonian, after Bogoliubov (1958) and Haag (1962). Further contributions
to mean-field theories include Thirring & Wehrl (1967), Thirring (1968), Hepp
(1972), Hepp & Lieb (1973), van Hemmen (1978), Rieckers (1984), Morchio &
Strocchi (1987), Duffner & Rieckers (1988), Bóna (1988, 1989, 2000), Unnerstall
(1990a, 1990b), and Sewell (2002). For a nice proof of Theorem 10.25, which originates
in Fannes, Spohn, & Verbeure (1980) and Bóna (1989), see Gerisch (1993).

Even in the absence of a global KMS condition for one is justified in interpreting
the primary states $(\omega_\theta)^\infty$ as pure thermodynamic phases of the given
infinite quantum system, whose thermodynamics is described by the “phase space”
$S(M_n(\mathbb{C}))$. Though somewhat against the spirit of Bohrification (according to which
the commutative C*-algebra $C(M_n(\mathbb{C}))$ is the right one to look at), the argument
can be strengthened by enlarging $A$ to $A \otimes C(M_n(\mathbb{C}))$ (where the choice of the tensor
product does not matter, since $C(M_n(\mathbb{C}))$ is commutative and hence nuclear, see
§C.13). This larger C*-algebra was introduced by Bóna (1990), who proved:

Theorem 10.30. 1. There is a unique time-evolution $\alpha$ on $A \otimes C(M_n(\mathbb{C}))$ such that
for any primary permutation-invariant state $\omega$ on $A$ and $a \in A$ one (strongly) has

$$\lim_{N\to\infty} \pi_\omega(\alpha_t^{(N)}(a)) = \pi_\omega(\alpha_t(a)). \qquad (10.397)$$

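2. The states $\hat\omega^\beta$ and $\omega_\theta^\infty$ in (10.286), which are defined on $A$, extend to the tensor
product $A \otimes C(M_n(\mathbb{C}))$ as $\hat\omega^\beta \otimes \mu_\beta$ and $\omega_\theta^\infty \otimes \delta_\theta$, respectively, and as such satisfy
the KMS condition at inverse temperature $\beta$ with respect to the dynamics $\alpha$.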

§10.9. The Goldstone Theorem 

There is a large amount of literature on the Goldstone Theorem, both heuristic
and rigorous. The former started with Goldstone, Salam, & Weinberg (1962),
whereas the latter originates in Kastler, Robinson, & Swieca (1966); see also Buchholz
et al (1992). For a survey, see Strocchi (2008, 2012), whose approach (based on
Morchio & Strocchi, 1987) we follow. See also Berzi (1979, 1981), Landau, Perez,
& Wreszinski (1981), Fannes, Pule, & Verbeure (1982), and Wreszinski (1987).

§10.10. The Higgs mechanism

The original reference is Higgs (1964a,b). Our discussion is based on Lusanna
& Valtancoli (1996a,b) and Struyve (2011), both of whom derive the physical variables
in the abelian Higgs model. See also Rubakov (2002), Strocchi (2008), where
Theorem 10.29 may be found, and Stöltzner (2014) for some history and sociology.

Chapter 11

The measurement problem 


The measurement problem of quantum mechanics was probably born in 1926:

‘Thus Schrödinger’s quantum mechanics gives a very definite answer to the question of the
outcome of a collision; however, this does not involve any causal relationship. One obtains 
no answer to the question “what is the state after the collision,” but only to the question 
“how probable is a specific outcome of the collision” (in which the quantum-mechanical 
law of [conservation of] energy must of course be satisfied). This raises the entire problem 
of determinism. From the standpoint of our quantum mechanics, there is no quantity that 
could causally establish the outcome of a collision in each individual case; however, so far 
we are not aware of any experimental clue to the effect that there are internal properties of 
atoms that enforce some particular outcome. Should we hope to discover such properties 
that determine individual outcomes later (perhaps phases of the internal atomic motions)? 

Or should we believe that the agreement between theory and experiment concerning our inability
to give conditions for a causal course of events is some pre-established harmony that
is based on the non-existence of such conditions? I myself tend to relinquish determinism in
the atomic world. But this is [also] a philosophical question, for which physical arguments
alone are not decisive.’ (Born, 1926a, p. 866; translation by the author)

In other words, quantum mechanics stipulates that the state after some collision (or
measurement) is $\psi = \sum_n c_n \psi_n$, whereas experiment demonstrates that in fact the final
state is just one of the $\psi_n$, with (Born) probability $|c_n|^2$. Quantum mechanics,
then, seems unable to account for single outcomes of experiments and has to satisfy
physicists with merely probabilistic predictions. This, in a nutshell, is the measurement
problem, although very substantial analysis is needed to flesh it out.
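
The gap between the superposition and the single observed outcome can be made concrete in a minimal numerical sketch. The amplitudes below are hypothetical, chosen only for illustration, and the random draw is put in by hand: nothing in the unitary formalism produces it, which is the point of the paragraph above.

```python
import random

def born_probabilities(coeffs):
    """Return the Born probabilities |c_n|^2 for amplitudes c_n (normalized first)."""
    norm_sq = sum(abs(c) ** 2 for c in coeffs)
    if norm_sq == 0:
        raise ValueError("state vector must be nonzero")
    return [abs(c) ** 2 / norm_sq for c in coeffs]

def sample_outcome(coeffs, rng):
    """Draw one index n with probability |c_n|^2, mimicking what an experiment reports."""
    probs = born_probabilities(coeffs)
    return rng.choices(range(len(coeffs)), weights=probs, k=1)[0]

# Hypothetical amplitudes c_1, c_2 of psi = c_1 psi_1 + c_2 psi_2 (an assumption).
coeffs = [3 / 5, 4j / 5]
probs = born_probabilities(coeffs)

# Repeating the "experiment" many times recovers the probabilities as frequencies,
# while each individual run yields exactly one outcome.
rng = random.Random(0)
runs = 10_000
counts = [0, 0]
for _ in range(runs):
    counts[sample_outcome(coeffs, rng)] += 1
frequencies = [c / runs for c in counts]
```

Here `probs` is approximately `[0.36, 0.64]`, and `frequencies` approaches these values as the number of runs grows.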

Giving up determinism was soon incorporated in the Copenhagen Interpretation 
of Bohr and Heisenberg (cf. the Introduction) and more broadly became part of 
what might be called “orthodoxy”, which represents the apparent (but not actual) 
consensus among Bohr, Heisenberg, Pauli, Born, Jordan, Dirac, von Neumann, and 
many others, which they supposedly reached around 1930 after the formal completion
of quantum mechanics. This “orthodoxy”, which later gave rise to the unfortunate
“shut up and calculate” attitude most physicists seem to have (especially
towards the measurement problem), should be distinguished from the Copenhagen 
Interpretation. For example, von Neumann never endorsed the doctrine of classical 
concepts, which in the above attitude has been replaced by the different and far more 
superficial idea that it is the entire goal of physics to explain experiments. 


11.1 The rise of orthodoxy 


Even within the strict Copenhagen Interpretation, there were sharp differences between
Bohr and Heisenberg, beyond the one concerning classical concepts reviewed
in the Introduction. However, it seems that they agreed about the following point 
made by Bohr in his Como lecture concerning measurement: 

‘According to the quantum theory, just the impossibility of neglecting the interaction with 
the agency of measurement means that every observation introduces a new uncontrollable 
element.’ (Bohr, 1928, p. 584) 

This placed measurement squarely outside quantum mechanics for the second time: 
the first time was in the insistence that the measurement device (“if it is to serve 
its purpose”) had to be described classically (cf. the Introduction), and now we also 
learn that the interaction between the quantum object undergoing measurement and
the apparatus in question is “uncontrollable”. Despite the fact that Bohr and Heisenberg
regarded quantum mechanics as a complete theory, their argument was apparently
that precisely the classical nature of the apparatus makes the interaction
uncontrollable. This in turn justified the classical description of the device, in that 
registration of a measurement result ought to be “objective”, so that reading it out 
by performing a measurement on the apparatus, so to speak, should not introduce 
any further disturbance and hence uncontrollability (or so the argument goes). 

Consistent with Bohr’s point, a more detailed conceptual analysis of the measurement
process was given by Heisenberg (1958, pp. 46–47, 54–55), who consistently
refers to the quantum state or wave-function as the “probability function”:

‘Therefore, the theoretical interpretation of an experiment requires three distinct steps: 

1. the translation of the initial experimental situation into a probability function; 

2. the following up of this function in the course of time; 

3. the statement of a new measurement to be made of the system, the result of which can 
then be calculated from the probability function. 

(...) After [the] interaction [with the measuring device] has taken place, the probability
function contains the objective element of tendency and the subjective element of incomplete
knowledge, even if it has been a “pure case” before [i.e., it has become a mixture].

It is for this reason that the result of the observation cannot generally be predicted with 
certainty; what can be predicted is the probability of a certain result of the observation, 
and this statement about the probability can be checked by repeating the experiment many 
times. (...) The observation itself [i.e., the act of registration of the result by the mind of the 
observer] changes the probability function discontinuously; it selects of all possible events 
the actual one that has taken place. Since through the observation our knowledge of the system
has changed discontinuously, its mathematical representation also has undergone the
discontinuous change and we speak of a “quantum jump.”’

Here we find the typical Copenhagen view of measurement as a two-step process: 

1. Measurement turns an initial pure state (of the measured object) into a mixture; 

2. One term in this mixture is singled out (by Nature and thence by the observer). 

Note that Heisenberg’s last comment puts him squarely into the camp of what is 
now called “QBism” (i.e., Quantum Bayesianism, see §11.2 below)!

Von Neumann (1932, §VI.1) gave a more formal (and highly influential) presentation
of the (alleged) two stages of the measurement process:


‘In the discussion so far we have treated the relation of quantum mechanics to the various
causal and statistical methods of describing nature. In the course of this we found a peculiar
dual nature of the quantum mechanical procedure which could not be satisfactorily
explained. Namely, we found that on the one hand a state $\phi$ is transformed into the state $\phi'$
under the action of an energy operator $H$ in the time interval $0 \leq \tau \leq t$:

$$\frac{\partial}{\partial \tau}\phi_\tau = -\frac{2\pi i}{h} H \phi_\tau \qquad (0 \leq \tau \leq t),$$

so if we write $\phi_0 = \phi$, $\phi_t = \phi'$ then $\phi' = e^{-\frac{2\pi i}{h} tH}\phi$, which is purely causal. A mixture $U$ is
correspondingly transformed into

$$U' = e^{-\frac{2\pi i}{h} tH}\, U\, e^{\frac{2\pi i}{h} tH}.$$

Therefore, as a consequence of the causal change of $\phi$ into $\phi'$ the [pure] states $U = P_{[\phi]}$
[$= |\phi\rangle\langle\phi|$] go over into the [pure] states $U' = P_{[\phi']}$ (process 2 in V.1.). On the other hand,
the state $\phi$ – which may measure a quantity with discrete spectrum, distinct eigenvalues and
eigenfunctions $\phi_1, \phi_2, \ldots$ – undergoes in a measurement a non-causal change in which each
of the states $\phi_1, \phi_2, \ldots$ can result, and in fact does result with the respective probabilities
$|(\phi,\phi_1)|^2, |(\phi,\phi_2)|^2, \ldots$. That is, the mixture

$$U' = \sum_{n=1}^{\infty} |(\phi,\phi_n)|^2\, P_{[\phi_n]}$$

obtains (...) (process 1 in V.1.). Since the [pure] states [i.e. $P_{[\phi]}$] go over into mixtures,
the process is not causal. The difference between these two processes $U \mapsto U'$ is a very
fundamental one: aside from their different behaviors in regard to the principle of causality,
they are also different in that the former is (thermodynamically) reversible, while the latter
is not.’ (pp. 417-418 in von Neumann (1955); translation: R.T. Beyer)

All this concerns merely the first stage of the measurement, in which a pure state 
is transformed into a mixed one. The second stage, in which a single outcome is 
obtained, is already alluded to above (though clouded by von Neumann’s ensemble 
language), but is described (in prose) later on through what is now called a von Neumann
chain: one redefines system plus apparatus as the system, and couples it to a
new apparatus, etc. This chain supposedly ends with the “ego” of the “individual” 
whose “intellectual inner life” is finally responsible for a single outcome. 
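
The contrast between von Neumann's two processes in the quotation above can be checked on the smallest possible example. The following is a sketch under stated assumptions, not von Neumann's own presentation: a two-level system, a Hamiltonian that is diagonal in the measured eigenbasis, units with $h/2\pi = 1$ by default, and helper names (`process_1`, `process_2`, `purity`) that are purely illustrative. It uses the purity $\mathrm{Tr}(U^2)$ to tell pure states from mixtures.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state and is below 1 for a proper mixture."""
    sq = matmul(rho, rho)
    return (sq[0][0] + sq[1][1]).real

def process_2(rho, energies, t, hbar=1.0):
    """Causal (unitary) change rho -> V rho V* with V = exp(-i t H / hbar), H diagonal."""
    v = [[cmath.exp(-1j * t * energies[0] / hbar), 0],
         [0, cmath.exp(-1j * t * energies[1] / hbar)]]
    return matmul(matmul(v, rho), dagger(v))

def process_1(rho):
    """Non-causal change: keep the weights |(phi, phi_n)|^2, drop the off-diagonal terms."""
    return [[rho[0][0], 0], [0, rho[1][1]]]

# Pure state phi = (phi_1 + phi_2)/sqrt(2), written as the projection |phi><phi|.
rho = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]

evolved = process_2(rho, energies=(0.0, 1.0), t=2.0)       # still pure
restored = process_2(evolved, energies=(0.0, 1.0), t=-2.0)  # process 2 can be undone
measured = process_1(rho)                                   # mixture with purity 1/2
```

In this toy model `purity(evolved)` stays at 1 and `restored` agrees with `rho`, whereas `purity(measured)` drops to 1/2 and no unitary map can raise it back to 1, which mirrors the reversible versus irreversible distinction drawn in the quotation.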

It is very remarkable that von Neumann nowhere seems to use the central Copenhagen
dogma that the apparatus be described classically (cf. the Introduction), especially
since the mathematics of operator algebras he was inventing at almost exactly
the same time is tailor-made for incorporating this dogma (which fact indeed forms 
the motivation for the present book). One clue for his lack of enthusiasm may come 
from the very end of his book (i.e., §VI.3), where he challenges ‘an explanation 
often proposed to account for the statistical character of the process 1’, namely the 
idea that (the non-unitary) process 1 might have its origin in an initial mixed state of 
the apparatus. Indeed, even if the apparatus as a quantum-mechanical system is in a 
pure state (as any system should be ontologically), its description as a classical system
generally renders its state mixed, and the same conclusion may be drawn on
epistemic grounds, arguing that the state of macroscopic or otherwise complicated
systems cannot be known exactly. Many writings by the Copenhagen school, then,
suggest that the alleged unanalyzable nature of the measurement and the randomness
of its outcome should be attributed to the classical description of the apparatus
and its ensuing mixed state, including our earlier quotation (cf. §8.4) from Heisenberg
(1958) on the origin of probabilities in quantum mechanics:

‘these uncertainties (...) are simply a consequence of the fact that we describe the experi¬ 
ment in terms of classical physics’ (Heisenberg, 1958, p. 53) 

To counter this argument, von Neumann argues that physics requires the (Born)
probabilities for the various outcomes to depend only on the initial state $\phi$ of the
quantum system undergoing measurement (as opposed to the state of the apparatus,
be it classical or quantum), whereas any “process 2” (i.e. unitary) time evolution
would merely push the coefficients $w_n$ in the (alleged) mixed apparatus state into the
role of probabilities for the possible outcomes. However, ‘the $w_n$ are characteristic
of the observer alone (and therefore independent of $\phi$)’, and hence

‘the non-causal nature of the process 1. is not produced by any incomplete knowledge of 
the state of the observer.’ (von Neumann, 1955, p. 439). 

Von Neumann’s argument became the mother of all “insolubility theorems” for the 
measurement problem, some of which will be reviewed in §11.3 below. 

Pauli (1933, §9) also includes some comments on measurement and the interpretation
of quantum mechanics in general. These display a bizarre hybrid between the
ideas of Bohr and von Neumann, somehow mediated by Heisenberg. Thus Pauli endorses
(even starts with) some notion of Complementarity, but he relates this to the
mathematical formalism rather than to the doctrine of classical concepts (which he 
nowhere invokes). Similarly, his treatment of measurement on the one hand follows 
the disturbance ideology of Bohr and Heisenberg (but without grounding this in the 
classical description of the apparatus), whilst technically he quotes and follows von 
Neumann, claiming that measurement leads to mixtures which subsequently reduce 
to one term through ‘ein besonderer, naturgesetzlich nicht im Voraus determinierter
Akt’ (i.e., a special process that does not follow deterministic laws of nature). A rather
more systematic review of early measurement theory was written by London & 
Bauer (1939), whose opening is highly promising and almost poetic: 

‘The majority of introductions to quantum mechanics follow a rather dogmatic path from 
the moment that they reach the statistical interpretation of the theory. In general they are 
content to show, by more or less intuitive considerations, how the actual measuring devices 
always introduce an element of indeterminism, as this interpretation demands. However, 
care is rarely taken to verify explicitly that the formalism of the theory, applied to that 
special process which constitutes the measurement, truly implies a transition of the system 
under study to a state of affairs less fully determined than before. A certain uneasiness 
arises. One does not see exactly with what right and up to what point one may, in spite of 
this loss of determinism, attribute to the system an appropriate state of its own. Physicists 
are to some extent sleepwalkers, who try to avoid such issues and try to concentrate on 
concrete problems. But it is exactly these questions of principle which nevertheless interest 
nonphysicists and all who wish to understand what modern physics says about the analysis
of the act of observation itself.’ (London & Bauer, 1939, pp. 218-219)

Yet the authors mainly repeat von Neumann’s analysis (confirming its lofty status):

‘The interaction with the apparatus does not put the object into a new pure state. Alone, 
it does not confer to the object a new wave function. On the contrary, it actually gives 
nothing but a statistical mixture: It leads to one mixture for the object and one mixture for 
the apparatus. For either system regarded individually there results uncertainty, incomplete 
knowledge. Yet nothing prevents our reducing this uncertainty by further observation. 

And this is our opportunity. So far we have only coupled one apparatus with one object. 

But a coupling even with a measuring device is not yet a measurement. A measurement is 
achieved only when the position of the pointer has been observed. It is precisely the increase 
of knowledge, acquired by the observation, that gives the observer the right to choose among 
the different components of the mixture predicted by the theory, to reject those which are not 
observed, and to attribute thenceforth to the object a new wave function, that of the pure case 
which he has found. We note the essential role played by the consciousness of the observer 
in this transition from the mixture to the pure state. Without his effective intervention, one 
would never obtain a new $\psi$ function.’ (ibid., p. 251)

Accordingly, at the end of the golden era of quantum mechanics, the view of measurement
as a two-stage process in which a pure state is first transformed into a mixture
in a more or less scientific way, upon which unanalyzable and possibly mental
phenomena bring about a single outcome, was firmly established, although (the
point deserves to be repeated) in their formal treatments neither von Neumann nor
London & Bauer incorporated the key claim Bohr and Heisenberg made about measurement,
namely that the corresponding apparatus must be described classically.

Opponents of the Copenhagen Interpretation (the most prominent among whom 
were Einstein and Schrödinger) were well aware of this tension between formalism
and ideology, which in the form of Schrödinger’s Cat even reached immortality (!):

‘One may also construct highly burlesque cases. A cat is confined in a box of steel together 
with the following hellish machine (which one should secure against a direct attack by the 
cat): A Geiger counter contains a tiny amount of radioactive material, so little that during 
one hour possibly one of its atoms decays, but equally likely also none does; if it does, then 
the counter is triggered and activates, via a relay, a little hammer which breaks a small
container of hydrocyanic acid. Having left this system to itself for one hour, one will say 
that the cat is still alive if meanwhile no atom has decayed. The first decay of an atom 
would have poisoned her. The $\psi$-function of the entire system would express this in such
a way that in it the living and the dead cat would be mixed or spread out on equal terms. 
What is typical about these cases is that an uncertainty which is originally limited to the 
atomic domain has been transformed into a coarse-grained uncertainty, which may then 
be decided by direct observation. This prevents us from regarding a “faded model” as an 
image of reality in such a naive way. As such [this model] contains nothing that is unclear 
or contradictory. There is a difference between a moved or poorly focused photograph and 
a record of clouds and fog banks.’ (Schrödinger, 1935, p. 812; translation by the author)

The last sentence is particularly powerful, contrasting Schrödinger’s (as well as Einstein’s)
view that physics should describe some sharply defined reality (of which
quantum mechanics at best produces blurred pictures) with the Copenhagen view, 
according to which reality itself lacks focus (with quantum mechanics providing the 
best possible picture of it). This contrast confirms our idea that Schrödinger’s Cat
metaphor specifically draws attention to the problems that arise from the Copenhagen
“duality postulate” that macroscopic systems (such as measurement devices
and cats) admit both a classical and a quantum-mechanical description. 

11.2 The rise of modernity: Swiss approach and Decoherence

Despite Schrödinger’s Cat, the measurement problem was not an active field of research
until Wigner (1963) rekindled interest in the topic. Even so, his paper mainly
reiterated von Neumann’s views (which already had been repeated by London and
Bauer), including his omission of the doctrine of classical concepts. In particular, it
continued to promulgate the suggestion that measurement is a two-step process for 
which the clarification of the first step (i.e. of turning a pure state into a mixture) 
would already be a major part of the solution of the measurement problem. 

Wigner’s paper inspired for example the “Swiss” approach to the measurement
problem, which was remarkable in being the first serious mathematical attempt to
take into account the Bohr-Heisenberg dogma that the apparatus be described classically,
whilst also paying tribute to von Neumann in insisting on mathematical rigour.
Indeed, the Swiss approach relies on the formalism of operator algebras, which also 
marks a conceptual break with all earlier—and indeed most later—approaches in 
taking the observables rather than the states as a starting point. The aim of the Swiss 
approach is to show that relative to a suitable class of observables, the pure state 

$$\rho = |\Psi\rangle\langle\Psi|, \qquad \Psi = \sum_n c_n \psi_n,$$

coincides with the corresponding mixture without the off-diagonal terms, i.e.,

$$\sum_n |c_n|^2\, |\psi_n\rangle\langle\psi_n|.$$
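
What "coincides relative to a suitable class of observables" means can be illustrated in a two-dimensional toy model. This is only a sketch under assumptions: the amplitudes are hypothetical and real, and the restriction to observables diagonal in the $\psi_n$ basis is imposed by hand here, whereas in the approach described above it is meant to arise in an infinite limit.

```python
def expectation(rho, a):
    """Tr(rho a) for 2x2 matrices given as nested lists."""
    return sum(rho[i][k] * a[k][i] for i in range(2) for k in range(2))

# Hypothetical real amplitudes c_1, c_2 with c_1^2 + c_2^2 = 1 (an assumption).
c = (0.6, 0.8)

# Pure state |Psi><Psi| with Psi = c_1 psi_1 + c_2 psi_2, including off-diagonal terms.
pure = [[c[i] * c[j] for j in range(2)] for i in range(2)]

# Corresponding mixture: the same diagonal weights |c_n|^2, off-diagonal terms removed.
mixture = [[c[0] ** 2, 0.0], [0.0, c[1] ** 2]]

# An observable diagonal in the psi_n basis (a stand-in for a "pointer" reading).
pointer = [[1.0, 0.0], [0.0, -1.0]]

# An observable with off-diagonal entries, sensitive to interference terms.
interference = [[0.0, 1.0], [1.0, 0.0]]

same_on_pointer = expectation(pure, pointer) - expectation(mixture, pointer)
differ_on_interference = expectation(pure, interference) - expectation(mixture, interference)
```

On `pointer` the two states give the same expectation value, so `same_on_pointer` vanishes up to rounding; on `interference` they differ by $2 c_1 c_2$. The equivalence therefore holds only as long as observables of the second kind are excluded from the algebra.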

Thus the ambition of this approach is limited, in that no attempt is made to explain 
(at least the appearance of) single outcomes, except by appealing to the ignorance 
interpretation of probability (in vain, see below). The alleged equivalence between 
pure states and mixtures can typically be achieved if the apparatus is infinite and 
the measurement time is infinite, too. The infinite character of the apparatus (here 
seen as an idealization of a macroscopic device, as is standard in quantum statistical 
mechanics), is no guarantee for its classicality, but it is certainly a step in the right 
direction (cf. Chapter 8). Thus two closely related problems must be overcome:

1. In its reliance on superselection sectors (technically, on disjoint states on a suitable
algebra of observables of the apparatus, see Definition 8.18), the program
only works in the limit of infinite apparatus and infinite measurement time. Indeed,
any approximation ruins the equivalence between pure states and mixtures;
and hence even this limited solution to the problem violates Earman’s Principle.

2. In so far as the subsequent problem of obtaining single outcomes to measurement 
is recognized in the Swiss approach at all, it seems to be addressed by an appeal to 
the ignorance interpretation of probability. Despite the fact that the mathematical 
situation in this respect is better than in ordinary quantum mechanics (where 
the ignorance interpretation of the formal probability distribution given by the 
coefficients in a diagonal density operator is nonsensical, if only because the 
state space is not a simplex), there is still no valid argument for this move. 

To explain the last point, we quote Leggett (though somewhat out of context):

‘Now, following Schrödinger, let us consider a thought experiment in which the
quantum-mechanical description of the final state, as obtained by appropriate solution of the
time dependent Schrödinger equation, contains simultaneously nonzero probability amplitudes for
two or more states of the universe that are, by some reasonable criterion, macroscopically
distinct (in Schrödinger’s example, this would be “cat alive” and “cat dead”). Of course, just
about everyone, including me, would accept that because of, inter alia, the effects of
decoherence, it is likely to be impossible, at least for the foreseeable future, to experimentally
demonstrate the interference of such states. (On the other hand, as the late John Bell was
fond of pointing out, the foreseeable future is not a very well-defined concept. In fact, as
late as 1999, not a few people were confidently arguing that because of the inevitable
effects of decoherence, the projected experiments to demonstrate interference at the level of
flux qubits would never work. In this case, the foreseeable future lasted approximately one
year. As Bell used to emphasize, the answers to fundamental interpretive questions should
not depend on the accident of what is or is not currently technologically feasible.) But the
crucial point is that the formalism of quantum mechanics itself has changed not one whit
between the microscopic and macroscopic levels. Are we then entitled to embrace, at the
macrolevel, an interpretation that was forbidden at the microlevel, simply because the
evidence against it is no longer available? I would argue very strongly that we are not, and
would therefore draw the conclusion: also at the macrolevel, when the quantum-mechanical
description assigns simultaneously nonzero [probabilities] to two or more macroscopically
distinct possibilities, then it is not the case that each system of the relevant ensemble realizes
either one possibility or the other.’ (Leggett, in Schlosshauer, 2011, p. 155)

This argument of Leggett’s (which is a special case of Earman’s Principle) was originally
targeted at decoherence, but it also applies verbatim to the Swiss approach
(which is closely related to decoherence, as both heavily rely on limits and superselection
rules, which are absolute in the former and dynamically induced in the
latter). In an even earlier hunch of Earman’s Principle, Bell (this time aiming directly
at the Swiss approach) in fact made a related point about its reliance on the
$t \to \infty$ limit (in that even at extremely large but finite time the state remains pure).

Jumping to the modern era, a striking point of continuity with the 1920s and
1930s is the idea that the measurement procedure (and hence the measurement problem)
consists of two stages; only the terminology and the scope have changed:

‘There are two distinct measurement problems in quantum mechanics: what Pitowsky has
called a “big” measurement problem and a “small” measurement problem. The “big”
measurement problem is the problem of explaining how measurements can have definite
outcomes, given the unitary dynamics of the theory: it is the problem of explaining how
individual measurement outcomes come about dynamically. The “small” measurement
problem is the problem of accounting for our familiar experience of a classical, or Boolean,
macroworld, given the non-Boolean character of the underlying quantum event space: it is
the problem of explaining the dynamical emergence of an effectively classical probability.’
(Bub, in Schlosshauer, 2011, pp. 145-146) 

Clearly, the “small” measurement problem is modern parlance for the problem 
how to turn a superposition into a mixture, upon which the “big” problem—if it is 
noticed at all—still concerns the old issue of selecting one term from this mixture. 

Furthermore, the measurement problem seems to have acquired increased scope
and importance, as exemplified by the following quotations: 

‘One of the most ancient philosophical questions (Heidegger thought it was the question) is
this: why is there something rather than nothing? In terms of events rather than substances, 
the question would be: how come anything happens at all? That question is the measurement 
problem.’ (Fine, in Schlosshauer, 2011, p. 146) 

‘The measurement problem has been called “the reality problem” by Philip Pearle. This is 
a better name for it. We perceive objects in the world as being in definite states. A door 
is either open or shut, a given ball either is in a given box or it is not. The wave function, 
however, can have superpositions of these things, suggesting that the door can be simultaneously
open and shut at the same time, and that the ball can be both in the box and not in the
box at the same time. The reality problem is that there is a discrepancy between the version 
of reality we perceive, and the version presented to us by the most obvious interpretation of 
the wave function.’ (Hardy, in Schlosshauer, 2011, p. 153) 

‘Fundamentally, the measurement problem is the problem of connecting probability with 
truth in the quantum world, that is to say, it is the problem of how to relate quantum probabilities
to the objective occurrence and non-occurrence of events. The problem arises because
there appears to be a difficulty in reconciling the objectivity of a particular measurement 
outcome with the entangled state at the end of a measurement.’ (Bub, ibid., p. 145) 

More technically, the measurement problem has come to be seen as a special case 
of the problem of explaining at least the appearance of the classical world from 
quantum theory. If the measurement problem is seen from the Copenhagen perspective
this is eminently reasonable, as both problems involve the dual description of
either the apparatus or the world around us as both classical and quantum (and its 
possible failure). In this context, an alleged solution to the “small” problem, such as 
Decoherence, is often also seen as this explanation (as if there were no issue about 
the derivation of the laws of classical physics, including the dynamical ones). 

A propos, another characteristic feature of the modern era is undoubtedly the 
dominance of Decoherence (if only over the Swiss approach), for example: 

‘I think the whole discussion about whether measurements in quantum mechanics are indeed
problematic somewhat misses the point. Measurement interactions are only one of
many examples of quantum interactions that lead to superpositions of macroscopically distinct
states. Nature has been producing macroscopic superpositions for millions of years,
well before any quantum physicist cared to artificially engineer such a situation. The key 
concept here is decoherence. Environmental interactions tend to produce superpositions of 
classically distinct states. This raises the issue of how one could describe a classical regime 
in quantum mechanics, quite irrespective of the existence of measuring apparatuses. (...) 

If decoherence and its applications had been developed early in the history of quantum 
theory, then the idea that measurements play a special role in the theory might not have 
risen to such prominence, and the foundations of quantum mechanics would have focused 
instead on the problem of how to derive a classical regime within the theory.’ 

(Bacciagaluppi, in Schlosshauer, 2011, p. 143) 

Mathematically, decoherence boils down to the idea of adding one more link to
the von Neumann chain (see §11.1) beyond S + A (i.e. the system and the apparatus).
However, there is a fundamental conceptual as well as technical difference
between Decoherence and older approaches that took such a step: whereas
previously (e.g., in the hands of von Neumann, London & Bauer, and Wigner) the
chain converged towards the observer, in Decoherence it diverges away from the
observer. Namely, the third and final link is now taken to be the environment.
This notion is often taken in a fairly literal sense in agreement with the intuitive
meaning of the word, but it may also (we would even say: preferably) refer to internal
degrees of freedom of the apparatus, as in the Spehner-Haake model in §11.4.
Either way, the “environment” is usually treated as an infinite system (necessitating
a limit like $N \to \infty$), which (in simple models where the pointer has discrete spectrum)
has the consequence that the post-measurement state

$$\sum_n c_n\, \psi_n \otimes \phi_n \otimes \chi_n$$

(in which the $\chi_n$ are mutually orthogonal) is reached only in the limit $N \to \infty$
of infinitely many degrees of freedom and, moreover, only in the limit $t \to \infty$ of infinite time. In
that case, the restriction of the above state to S + A (i.e. the trace of the corresponding
density operator over the degrees of freedom of the environment) is mixed, which
means that the quantum-mechanical interference between the states $\psi_n \otimes \phi_n$ for different
values of n has become “delocalized” to the environment, and accordingly is
deemed irrelevant if the latter is not observed (i.e. omitted from the description).
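
The suppression of interference by tracing out the environment can be illustrated numerically. The following is a minimal sketch in Python (standard library only), assuming a hypothetical toy environment of N two-level systems for which distinct environment states have overlap cos(theta)^N; the function names and the parameter theta are illustrative assumptions and do not refer to any specific model in the text. It suggests that the off-diagonal terms of the restricted state decay exponentially in N, yet vanish exactly only in the limit.

```python
import math

def reduced_state(c, n_env, theta=0.3):
    """Restriction to S+A of sum_n c_n psi_n (x) phi_n (x) chi_n.

    The matrix is written in the orthonormal basis psi_n (x) phi_n.
    Toy assumption: <chi_m, chi_n> = cos(theta)**n_env for m != n.
    """
    r = math.cos(theta) ** n_env
    size = len(c)
    # entry (i, j) = c_i * conj(c_j) * <chi_j, chi_i>
    return [[c[i] * c[j].conjugate() * (1.0 if i == j else r)
             for j in range(size)] for i in range(size)]

def purity(rho):
    """Tr(rho^2) for a Hermitian matrix: the sum of |rho_ij|^2."""
    return sum(abs(x) ** 2 for row in rho for x in row)

c = [1 / math.sqrt(2), 1 / math.sqrt(2)]
for n_env in (0, 10, 100, 1000):
    rho = reduced_state(c, n_env)
    # columns: N, size of the interference term, Tr(rho^2)
    print(n_env, abs(rho[0][1]), purity(rho))
```

For any finite N the interference term is small but nonzero, which is the point at which the critiques of Bell and Leggett quoted above take hold.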

Unfortunately, in so far as it claims to provide a solution to the measurement
problem, Decoherence is an unmitigated disaster:

1. Decoherence actually aggravates the measurement problem: where previously 
this problem was believed to be man-made and relevant only to rather unusual 
laboratory situations, it has now become clear that “measurement” of a quantum 
system by the environment (instead of by an experimental physicist) happens 
everywhere and all the time: hence it remains even more miraculous than before 
that there is a single outcome after each such measurement. 

2. Even the need for one of the two limits $N \to \infty$ or $t \to \infty$ makes Decoherence
vulnerable to Earman’s Principle; see Bell’s and Leggett’s critiques above.

3. Like the Swiss approach, Decoherence suffers from the difficulty that even if it
were able to reach its goal of reducing pure states to mixtures (about which ability
one may have doubts), there is no sound follow-up step to solve the next problem
of selecting one term from the mixture produced in the previous step. The ignorance
interpretation seems blocked by Leggett’s argument quoted above (i.e. his
continuity argument to the effect that Decoherence just removes the evidence for
a given Schrödinger’s cat state to be a superposition; elsewhere he charges those
claiming that Decoherence solves the measurement problem with committing the
logical fallacy that removal of the evidence for a crime would undo the crime).

Thus Decoherence is parasitic on some interpretation of quantum mechanics that 
solves the measurement problem, which in turn is typically strengthened by it. In 
this context, the most popular of these has been the Everett (i.e., Many-Worlds)
Interpretation, which, after decades of obscurity or even derision, suddenly started to 
be greeted with a flourish of trumpets in the wake of the popularity of Decoherence. 
However, even if such extravagant interpretations are coherent, these should in our 
opinion be a very last resort, acceptable only if truly everything else has failed. 

On the positive side, Decoherence has led to the important idea of einselection
(for environment-induced superselection), where a pure state $\psi$ of some system
(possibly plus apparatus) is “einselected” if it remains pure after coupling to the 
environment and subsequent restriction. The hope (or rather program), then, is to 
show that classical states are classical precisely because they are robust in this way. 

Finally, it may be appropriate to close this historical introduction to the measurement
problem by mentioning another modern approach, namely outright denial:

‘I remember giving a talk at a meeting at the London School of Economics seven or so years
ago. In the audience was an Oxford philosophy professor, and I suppose he didn’t much like 
my brash cowboy dismissal of a good bit of his life’s work. When the question session came 
around, he took me to task with the most proper and polite scorn I had ever heard (I guess 
that’s what they do). “Excuse me. You seem to have made an important point in your talk, 
and I want to make sure that I have not misunderstood anything. Are you saying that you 
have solved the measurement problem? This problem that has plagued quantum mechanics 
for seventy-five years? The message of your talk is that, using quantum information theory,
you have finally solved it?” (Funny the way the words could be put together as a question,
but have no intended usage but as a statement.) I don’t know that I did anything but turn the
screw on him a bit further, but I remember my answer. “No, not me; I haven’t done anything.
What I am saying is that a “measurement problem” never existed in the first place. (...) 

The “measurement problem” is purely an artefact of a wrong-headed view of what quantum
states and/or quantum probabilities ought to be. (...) quantum states are not real things
from a Quantum Bayesian view (...) but a personal judgment, a quantified degree of belief. 

A quantum state is a set of numbers an agent uses to guide the gambles he might take on 
the consequences of his potential interactions with a quantum system. It has no more substantiality
than that. Aren’t epistemic states real things? Well ... yes, in a way. They are as
real as the people who hold them. But no one would consider a person to be a property of 
the quantum system he happens to be contemplating. And one shouldn’t think of a quantum 
state in that way either—one shouldn’t think of it as a property of the quantum system to
which it is assigned. Take the source of the paradox away, we say, and the paradox itself 
will go away.’ (Fuchs, in Schlosshauer, 2011, pp. 146-147) 

These words have been quoted at some length, because the view that “physics is 
information” and its alleged corollary that all foundational problems are solved by 
Bayesian reasoning (perhaps with a quantum flavour) is becoming increasingly popular.
Physicists are now seen as punters (or, in academic parlance, “agents”) who
in smoky offices bet on the outcomes of experiments, and hence use (quantum)
Dutch Book arguments to justify some sort of strictly epistemic (quantum) probability
calculus. However, the ideology of “QBism” thus expressed appears to have
adopted precisely the weakest ingredients of the Copenhagen Interpretation—viz. 
the idea that the wave-function is just a catalogue of the probabilities for possible 
outcomes of measurements whose details are supposedly beyond our grasp, cf. the 
Introduction—at the expense of its one strong component, namely the doctrine of 
classical concepts. Although there may have been pragmatic reasons for this attitude
in the 1920s, (mathematical) physics has moved forward since then, enabling
much more detailed analysis and hence justifying considerably greater ambition in 
understanding the measurement process than Bohr and Heisenberg cum suis had. 

In any case, the fact that one competent author regards the measurement problem 
as the key to reality whilst another flatly denies even its very existence should give 
pause for thought. As in the Bohr-Einstein debate, different perspectives on reality 
and on the task of physics seem to play a role here, culminating in contrasting views 
of quantum-mechanical states; the more “reality” one attributes to states, the more 
serious the measurement problem is. Or, contrapositively, the more operationalist 
one’s attitude, the further the problem disappears behind the horizon. 

11.3 Insolubility theorems

Since in §11.4 we will “propose the impossible”, namely miraculously solving the 
measurement problem within unitary quantum mechanics, it is helpful to review the 
arguments why this is generally felt to be impossible. Such arguments take the form 
of so-called insolubility theorems. As already mentioned, such theorems ultimately 
go back to von Neumann: especially those that prove the impossibility of explaining 
his process 1 (i.e. the transition from a pure state to a mixture) from process 2 
(unitary time evolution according to the Schrödinger equation). Another kind of
insolubility theorem shows that single outcomes are impossible from process 2. 

It might be argued that both kinds of theorem add little to the basic mathematical 
intuition behind the measurement problem, which is as follows (it goes without 
saying that we disagree with this traditional description of measurement, see below). 
Let $s \in B(H_S)$ be the observable being measured (where $H_S$ is some Hilbert space
associated to a quantum object S undergoing measurement) and let $a \in B(H_A)$ be
a “pointer observable” correlated to S (where $H_A$ is a second Hilbert space). In
particular, the measurement apparatus A is described quantum mechanically. For the
moment we assume both Hilbert spaces to be finite-dimensional and both operators
to be non-degenerate, even having the same spectrum
$\sigma(s) = \sigma(a) = \{\lambda_1, \ldots, \lambda_n\}$; this of course
implies that $\dim(H_S) = \dim(H_A) = n$. Thus $H_S$ has a basis $(v_i^{(s)})$ of eigenvectors of
$s$ and likewise $H_A$ has a basis $(v_i^{(a)})$ of eigenvectors of $a$, with
$s v_i^{(s)} = \lambda_i v_i^{(s)}$ and $a v_i^{(a)} = \lambda_i v_i^{(a)}$
($i = 1, \ldots, n$). The (erroneous) argument, then, is as follows:

1. Measurement should establish a correlation between values of s of S and values
of a of A, which with the above labeling implies that for each i the initial system
state $v_i^{(s)}$ should push the pointer from some initial state $\psi_0$ into a final
(post-measurement) state $v_i^{(a)}$. Hence the dynamics, described by some unitary
operator $u \in B(H_S \otimes H_A)$, should be such that

$$u(v_i^{(s)} \otimes \psi_0) = v_i^{(s)} \otimes v_i^{(a)} = \varphi_i. \tag{11.1}$$

2. If the initial system state is $\sum_i c_i v_i^{(s)}$ (with $\sum_i |c_i|^2 = 1$), then, by linearity
of $u$, the final state is $\Psi = \sum_i c_i \varphi_i$. But if A is sufficiently macroscopic this conflicts
with observation, which always shows one of the terms in the sum. In other
words, in theory, $a$ (more precisely, $1_{H_S} \otimes a$) has no value in this state, whereas
in practice it does, since in the real world measurements do have outcomes.

3. Hence the final state should be the mixed density operator
$\sum_i |c_i|^2 |\varphi_i\rangle\langle\varphi_i|$ (rather
than the pure one $|\Psi\rangle\langle\Psi|$), whose ignorance interpretation (allegedly) yields one
of the states $\varphi_i$ with probability $|c_i|^2$. But it is impossible to transform the initial
pure state $|\Psi_0\rangle\langle\Psi_0|$ into the above mixture by any unitary operator, let alone by
the $u$ defined by (11.1), which by construction yields

$$u\, |\Psi_0\rangle\langle\Psi_0|\, u^* = |\Psi\rangle\langle\Psi|.$$
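
The obstruction in step 3 can be checked on a small example. The sketch below (Python, standard library only) assumes one hypothetical realization of (11.1), namely the permutation unitary u|i,j> = |i,(i+j) mod n> with initial pointer state equal to the first pointer eigenvector; this choice and all function names are assumptions made for illustration only. It compares Tr(rho^2) for the final pure state with that of the desired mixture. Since conjugation by any unitary leaves Tr(rho^2) unchanged, the gap between the two numbers cannot be bridged unitarily.

```python
def von_neumann_measurement(c):
    """Apply u|i,j> = |i,(i+j) mod n> to (sum_i c_i v_i) (x) v_0.

    Vectors on H_S (x) H_A are flat lists indexed by i * n + j.
    """
    n = len(c)
    initial = [c[i] if j == 0 else 0.0 for i in range(n) for j in range(n)]
    final = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            final[i * n + (i + j) % n] += initial[i * n + j]
    return final

def purity_of_vector_state(psi):
    """Tr(rho^2) for rho = |psi><psi|, which equals ||psi||^4."""
    return sum(abs(x) ** 2 for x in psi) ** 2

def purity_of_mixture(c):
    """Tr(rho^2) for rho = sum_i |c_i|^2 |phi_i><phi_i|, phi_i orthonormal."""
    return sum(abs(x) ** 4 for x in c)

c = [0.6, 0.8]
final = von_neumann_measurement(c)
print(final)                          # components on |0,0>, |0,1>, |1,0>, |1,1>
print(purity_of_vector_state(final))  # stays 1 up to rounding
print(purity_of_mixture(c))           # strictly below 1 for a genuine superposition
```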

As we already discussed, for some authors the measurement problem is the clash between
nos. 1 and 3 (this is the “small” problem), whereas for others it is the conflict
between nos. 1 and 2 (i.e. the “big” one). Either way, the goal of insolubility theorems
is to show that the problem is not a consequence of idealizations in primitive
arguments like the one just given, but remains even under very general assumptions.
In particular, both the purity of the initial system as well as apparatus states (and 
hence of their tensor product), and the exact system-apparatus correlation assumed 
(including the premise of point spectra and finite-dimensional Hilbert spaces), can 
be considerably relaxed. To illustrate the kind of discussion, we present one example 
of an insolubility proof along the former lines and one along the latter. These proofs 
even remain valid if the notion of an observable itself is relaxed, too, namely from 
a self-adjoint operator to a POVM (see (2.178)), but we will not discuss this utmost 
generality (if only because it would not circumvent our critique below). It should be 
noted that insolubility theorems tacitly assume that the mathematical objects in the 
quantum-mechanical formalism describe all there is physically. 

In the first direction, we have Theorem 11.2 below, which we may summarize as 
the problem of statistics: there is a contradiction between the following postulates: 

1. System and apparatus are both described quantum-mechanically.

2. The wave-function of the system is complete. 

3. The wave-function always evolves linearly (e.g., by the Schrödinger equation).

4. Measurements with identical initial wave-functions may have different outcomes,
and the probability of each possible outcome is given by the Born rule.

Here the second and third postulates may be consequences of the first, but even so 
it is useful to list them separately, since denying or circumventing nos. 1, 2, and 3 is 
typically done in completely different ways (see the end of this section). 

Formally, let $s = s^* \in B(H_S)$ be an arbitrary self-adjoint operator on an arbitrary
(separable) Hilbert space $H_S$, with associated spectral projections $e_\Delta^{(s)}$,
$\Delta \subset \sigma(s)$, and likewise $a \in B(H_A)$. It is convenient (and entails no genuine loss of
generality) to still assume that $\sigma(s) = \sigma(a)$. Recall that the Born measure $p_{\rho_S}^{(s)}$ on
the spectrum $\sigma(s)$ induced by some density operator $\rho_S \in \mathcal{D}(H_S)$ is given by

$$p_{\rho_S}^{(s)}(\Delta) = \mathrm{Tr}\,(\rho_S e_\Delta^{(s)}) = \omega_S(e_\Delta^{(s)}) = p_{\omega_S}^{(s)}(\Delta), \tag{11.3}$$

cf. (4.9), where $\omega_S$ is the state associated to $\rho_S$ by (2.33), and no notational confusion
between $p_{\rho_S}^{(s)}$ and $p_{\omega_S}^{(s)}$ should arise (they are the same thing). Likewise for $a$.

Definition 11.1. 1. Let $H$ be a Hilbert space and let $b \in B(H)_{\mathrm{sa}}$. Two (normal)
states $\omega, \omega'$ on $B(H)$ are called b-distinguishable if $p_\omega^{(b)} \neq p_{\omega'}^{(b)}$; in other words,
there is some $\Delta \subset \sigma(b)$ such that $p_\omega^{(b)}(\Delta) \neq p_{\omega'}^{(b)}(\Delta)$.
Similarly for $\rho, \rho' \in \mathcal{D}(H)$.
2. In the situation described before (11.3), a pair $(\rho_A, u)$, where $\rho_A$ is a density
operator on $B(H_A)$ and $u$ is a unitary operator on $H_S \otimes H_A$, is a measurement
scheme for s if s-distinguishability of two density operators $\rho_S, \rho'_S$ on $H_S$ implies
$(1_{H_S} \otimes a)$-distinguishability of the two states $u(\rho_S \otimes \rho_A)u^*$ and $u(\rho'_S \otimes \rho_A)u^*$.
3. A measurement scheme $(\rho_A, u)$ for s preserves probabilities if for any density
operator $\rho_S \in \mathcal{D}(H_S)$ the probability measure on $\sigma(a) = \sigma(1_{H_S} \otimes a)$ induced by
$u(\rho_S \otimes \rho_A)u^*$ equals the Born measure on $\sigma(s) = \sigma(a)$ induced by $\rho_S$.
4. A density operator $\rho \in \mathcal{D}(H_S \otimes H_A)$ objectifies the pointer observable a relative
to some countable partition $\sigma(a) = \bigsqcup_i \Delta_i$ of its spectrum if $\rho = \sum_i p_i e_{v_i}$, where
each unit vector $v_i \in H_S \otimes H_A$ is an eigenvector of $1_{H_S} \otimes e_{\Delta_i}^{(a)}$
($p_i > 0$, $\sum_i p_i = 1$).

For example, in case of a discrete spectrum for simplicity, if $\lambda_1 \neq \lambda_2$ in $\sigma(b)$, then
any two unit eigenvectors $v_i^{(b)}$ ($i = 1, 2$) give rise to b-distinguishable vector states
$\rho_i = |v_i^{(b)}\rangle\langle v_i^{(b)}|$. If $\psi = c_1 v_1^{(b)} + c_2 v_2^{(b)}$ with
$|c_1|^2 + |c_2|^2 = 1$ and $c_1 \neq 0, 1$, then
also the trio $(\rho_1, \rho_2, e_\psi)$ is pairwise b-distinguishable. If, on the other hand, $\lambda \in \sigma(b)$
is degenerate, then $e_\psi$ and $e_{\psi'}$ fail to be b-distinguishable whenever $\psi, \psi' \in H_\lambda$.
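
The examples just given can be reproduced in a few lines. The following Python sketch (standard library only) assumes a discrete spectrum and represents a unit vector by its components in an orthonormal eigenbasis of b; the function names `born_measure` and `b_distinguishable` are illustrative choices, not notation from the text. Two vector states are then b-distinguishable precisely when their Born probabilities differ on some eigenvalue.

```python
from collections import defaultdict

def born_measure(psi, eigenvalues):
    """Born probabilities of the unit vector psi for an observable b.

    psi is given by its components in an orthonormal eigenbasis of b,
    and eigenvalues[k] is the eigenvalue of the k-th basis vector.
    """
    p = defaultdict(float)
    for x, lam in zip(psi, eigenvalues):
        p[lam] += abs(x) ** 2
    return dict(p)

def b_distinguishable(psi, phi, eigenvalues, tol=1e-12):
    """True if the Born measures of psi and phi differ on some eigenvalue."""
    p = born_measure(psi, eigenvalues)
    q = born_measure(phi, eigenvalues)
    return any(abs(p[lam] - q[lam]) > tol for lam in p)

v1, v2 = [1.0, 0.0], [0.0, 1.0]
psi = [0.6, 0.8]  # c_1 v_1 + c_2 v_2 with |c_1|^2 + |c_2|^2 = 1

# Non-degenerate case: the trio (v1, v2, psi) is pairwise distinguishable.
print(b_distinguishable(v1, v2, [1, 2]),
      b_distinguishable(v1, psi, [1, 2]),
      b_distinguishable(v2, psi, [1, 2]))
# Degenerate case: both vectors lie in the same eigenspace.
print(b_distinguishable(v1, v2, [5, 5]))
```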
Clause 2 of Definition 11.1 (which incorporates a vast number of at least theoretical
scenarios) is a considerable weakening of the scheme (11.1), while clause
3 sharpens the second, implying that measurement transfers all Born probabilities
for the object to the apparatus, probabilistically making the latter a mirror image
of the former. Clause 4 firstly takes care of continuous spectra; if $\sigma(a)$ is discrete,
one may simply partition it by its points (a partition of $\sigma(a)$ is sometimes called
a reading scale). The “objectification” terminology is questionable (if not outright
misleading), as it is motivated by the ignorance interpretation of mixtures (see below),
but we follow the literature in using it. In what follows, we exclude the trivial
cases where $\sigma(s)$ consists of a single point, and/or $\sigma(a)$ is partitioned by itself.

Theorem 11.2. For any nontrivial object observable s and partitioning of $\sigma(a)$,
there exists no measurement scheme $(\rho_A, u)$ for s whose final state $u(\rho_S \otimes \rho_A)u^*$
objectifies a for any initial system state $\rho_S$ (let alone one that preserves probabilities).

Proof. Since we will not use this theorem (except for pointing out that it attacks a
straw man), we just prove it in the special case where $\sigma(a)$ is discrete and partitioned
by its points, and also the spectral decomposition $\rho_A = \sum_n p_n e_n$ of the initial
apparatus state is unique, cf. (B.490). For any unit vector $v^{(s)} \in H_S$ we then have

$$u(e_{v^{(s)}} \otimes \rho_A)u^* = \sum_n p_n\, u(e_{v^{(s)}} \otimes e_n)u^*. \tag{11.4}$$

Take $\lambda_1 \neq \lambda_2$ in $\sigma(s)$, with associated eigenvectors $v_1^{(s)}$ and $v_2^{(s)}$.
If $e_n = |a_n\rangle\langle a_n|$,
for unit vectors $a_n \in H_A$, then objectification of a requires that each of the vectors

$$u(v_1^{(s)} \otimes a_n), \quad u(v_2^{(s)} \otimes a_n), \quad u((c_1 v_1^{(s)} + c_2 v_2^{(s)}) \otimes a_n),$$

with $|c_1|^2 + |c_2|^2 = 1$ and $c_1 \neq 0, 1$, must be an eigenvector of $1_{H_S} \otimes a$. This is only
possible if the first two vectors (and hence the third) lie in the same eigenspace for
$1_{H_S} \otimes a$, but in that case condition no. 2 in Definition 11.1 is violated, since the three
given initial system states are pairwise s-distinguishable whereas the corresponding
outcome states just listed evidently fail to be $(1_{H_S} \otimes a)$-distinguishable. □

Insolubility theorems of the second kind describe the problem of outcomes, according
to which clauses 1, 2, and 3 of the problem of statistics also contradict:

4’. Measurements have determinate outcomes.

Technical statements to this effect are even more straightforward than those formalizing
the problem of statistics. We keep $H_S$ and $s \in B(H_S)$ as they were, but this
time, $H_A$ may refer to the rest of the Universe outside the quantum object described
by $H_S$ (which includes the pointer, of course). Here is the key assumption.

Definition 11.3. Let $s \in B(H_S)_{\mathrm{sa}}$ be an object observable with partition
$\sigma(s) = \bigsqcup_{i \in I} \Delta_i$ of its spectrum (if $\sigma(s) = \{\lambda_1, \ldots\}$ is discrete,
one may take $\Delta_i = \{\lambda_i\}$),
and let $H_A$ be a second Hilbert space. A sound measurement scheme consists of:

• A collection $(S_i)_{i \in I}$ of outcome spaces, i.e. subsets of the (normal) state space,

$$S_i \subset S_n(B(H_S \otimes H_A)) \cong \mathcal{D}(H_S \otimes H_A), \tag{11.5}$$

for which there is $0 \leq \eta < 1/2$ such that for $i \neq j$, one has

$$2\sqrt{1 - \eta} \leq \|\omega_i - \omega_j\| \leq 2 \qquad (\omega_i \in S_i,\ \omega_j \in S_j). \tag{11.6}$$

• A pair $(\rho_A, u)$, where $\rho_A$ is a density operator on $B(H_A)$ and $u$ is a unitary on
$H_S \otimes H_A$, such that for each $i \in I$ and each unit vector $v_i^{(s)} \in H_S$ satisfying
$e_{\Delta_i}^{(s)} v_i^{(s)} = v_i^{(s)}$,
the state $u(e_{v_i^{(s)}} \otimes \rho_A)u^*$ (i.e. the outcome of the measurement) lies in $S_i$.

In (11.6) the first bound (which for small $\eta$ is $\approx (2 - \eta) \leq \cdots$) is the key one, as
the last one $\leq 2$ is always satisfied and has been included for clarity. In particular,

$$\|\omega_i - \omega_j\| > \sqrt{2}. \tag{11.7}$$

Note that (11.6) implies that the $S_i$ must be disjoint, since assuming $\omega \in S_i$ gives
$\|\omega - \omega_j\| \geq 2\sqrt{1 - \eta}$ for all $\omega_j \in S_j$, whereas $\omega \in S_j$ allows one to take $\omega_j = \omega$
in this inequality, leading to the contradiction $0 \geq 2\sqrt{1 - \eta}$. Note that in terms of
density operators we have

$$\|\omega_i - \omega_j\| = \|\rho_i - \rho_j\|_1, \tag{11.8}$$

where $\omega_i(a) = \mathrm{Tr}(\rho_i a)$, cf. (B.481) and Theorem B.146. If $\omega_i$ and $\omega_j$ are pure,
induced by unit vectors $\psi_i$ and $\psi_j$ in $H_S \otimes H_A$, then by (C.637), eq. (11.6) comes
down to

$$0 \leq |\langle \psi_i, \psi_j \rangle|^2 \leq \eta. \tag{11.9}$$

For example, in the von Neumann measurement scheme (11.1), the subspace $S_i$ just
consists of the vector state defined by $v_i^{(s)} \otimes v_i^{(a)}$; hence (11.6) holds with $\eta = 0$.

Theorem 11.4. For any nontrivial object observable s and partitioning of $\sigma(s)$, any
sound measurement scheme $((S_i), \eta, \rho_A, u)$ admits initial states $v \in H_S$ such that
$u(e_v \otimes \rho_A)u^*$ (i.e. the post-measurement state) does not lie in any outcome space $S_i$.

Proof. Let $v = (v_i + v_j)/\sqrt{2}$, where $i \neq j$ and for the moment $v_i$ and $v_j$ are merely
orthonormal vectors in $H_S$. For each $l \in \{i, j\}$ we then compute:

$$\|u(e_v \otimes \rho_A)u^* - u(e_{v_l} \otimes \rho_A)u^*\|_1 = \|e_v - e_{v_l}\|_1 = 2\sqrt{1 - |\langle v, v_l \rangle|^2} = \sqrt{2}, \tag{11.10}$$

where $\|\cdot\|_1$ denotes the trace norm relative to $H$. Now take $v_i = v_i^{(s)}$ as in Definition 11.3.
Since $\omega_i = u(e_{v_i^{(s)}} \otimes \rho_A)u^* \in S_i$ by definition of a sound measurement, it
follows from (11.7) and (11.10) that $\omega = u(e_v \otimes \rho_A)u^*$ cannot lie in any subspace
$S_k$, since that would require $\|\omega - \omega_l\| > \sqrt{2}$ for all $l \neq k$, whereas (11.10) shows
that this inequality fails for at least two values of $l$, viz. $l = i$ and $l = j$. □
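
The value $\sqrt{2}$ in (11.10) can be verified independently of the closed formula. The sketch below (Python, standard library only) restricts to the two-dimensional span of $v_i$ and $v_j$ and computes the trace norm of $e_v - e_w$ from the eigenvalues of the resulting traceless Hermitian 2x2 matrix; the function names are illustrative assumptions. The first equality in (11.10), i.e. invariance of the trace norm under conjugation by $u$ and under tensoring with $\rho_A$, is taken for granted here and not checked.

```python
import math

def trace_norm_difference(v, w):
    """Trace norm of e_v - e_w for unit vectors v, w in C^2.

    e_v - e_w is a traceless Hermitian matrix [[a, b], [conj(b), -a]],
    whose eigenvalues are +/- sqrt(a^2 + |b|^2).
    """
    a = abs(v[0]) ** 2 - abs(w[0]) ** 2
    b = (v[0] * complex(v[1]).conjugate()
         - w[0] * complex(w[1]).conjugate())
    return 2 * math.sqrt(a ** 2 + abs(b) ** 2)

def closed_formula(v, w):
    """2 * sqrt(1 - |<v, w>|^2), the expression used in (11.10)."""
    inner = sum(complex(x).conjugate() * y for x, y in zip(v, w))
    return 2 * math.sqrt(1 - abs(inner) ** 2)

v_i, v_j = [1.0, 0.0], [0.0, 1.0]
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]

for w in (v_i, v_j):
    # both columns should agree with sqrt(2) up to rounding
    print(trace_norm_difference(v, w), closed_formula(v, w))
```

Both distances equal $\sqrt{2}$ up to rounding, so the strict inequality (11.7) fails for $l = i$ and $l = j$, as the proof requires.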

In order to circumvent Theorems 11.2 and 11.4, one should deny at least one of 
their explicit premises. Moreover, we note that postulate no. 3 (i.e. linearity of time- 
evolution) is always implicitly used in the form of the following counterfactual: 

If $\psi_n$ were the initial state, then for each n it would evolve (linearly) according
to the Schrödinger equation with given Hamiltonian h. If the initial state were
$\sum_n c_n \psi_n$, also then it would evolve according to the same Hamiltonian h.

This counterfactual should be added as a tacit assumption to all insolubility proofs 
(and also to informal statements of the measurement problem). As such, it may 
reasonably be denied (see § 11.4), and such a denial puts assumption no. 4 in the 
problem of statistics in perspective, namely by denying the possibility that identical 
initial states can always be prepared in such a way that they evolve through exactly 
the same Hamiltonian. This leaves room for the following denials of some premise: 

¬1. The apparatus is not described quantum-mechanically;

¬2. The wave-function of the system is not complete;

¬3. The wave-function does not always evolve by the Schrödinger equation;

¬4. Identical initial wave-functions always yield identical outcomes;

¬4’. Measurements do not have determinate outcomes.

Current programs for solving the measurement problem neatly fall into this scheme: 

¬1. Copenhagen Interpretation and Swiss Approach;

¬2. Hidden-variable theories, most prominently Bohmian mechanics;

¬3. Dynamical collapse theories (such as GRW);

¬4. Instability approaches, e.g., the Flea on Schrödinger’s Cat (which keeps 3);

¬4’. Many-Worlds Interpretation, i.e., Everettian quantum mechanics.

Leaving most of these to the literature, we now turn to the instability approach (¬4).

11.4 The Flea on Schrödinger’s Cat

The conclusion of this lengthy historical and technical introduction is that there are 
(at least) two different formulations of the measurement problem, whose insolubility 
is expressed by Theorems 11.2 and 11.4, respectively (leaving apart lavish opportunities
for disagreement about the precise formulation of the underlying assumptions,
and not even speaking about the outright dismissal of the whole issue as a Scheinproblem).
Thus the problem in question is evidently of a different kind from say the
famous open conjectures in mathematics (like the Riemann hypothesis), where it is
clear what the theorem is that needs to be proved. Nonetheless, despite its undeniable
philosophical aspects, we see the measurement problem as a genuine physics
problem concerned with the discrepancy between (quantum) theory and experiment, 
to be addressed by mathematical, physical, and philosophical analysis. 

Well aware that different people typically draw different lessons from history, 
we will now, in the interest of motivating our approach to follow, draw our own 
(necessarily subjective) conclusions from the history of the measurement problem. 

1. Though grounded in genius and tradition (Heisenberg, von Neumann, Wigner), 
the two-step way of looking at the measurement process (i.e. in terms of firstly 
a reduction of the wave-function by some non-unitary “process 1” and secondly 
a registration of a single outcome), with ensuing separation of the measurement 
problem into a “small” and a “big” problem, is fruitless and should be abandoned. 
It has no basis whatsoever in experimental physics (where the alleged mixed 
post-measurement states are conspicuously absent), it reflects obsolete ensemble 
thinking, and it is also unsound theoretically, as shown both by the first kind of
insolubility results (à la von Neumann and Theorem 11.2) and by the failure
of programs addressing just the “small” problem (like the Swiss approach
and Decoherence). These approaches are unable to deal with the “big” problem
(except perhaps through desperate remedies like Many Worlds) and hence, even
if they work, they deliver Pyrrhic victories at best. The problem of obtaining single
outcomes should be solved directly, before it is too late. Since such a solution
would leave nothing to interfere, the “small” problem automatically disappears. 
This does not mean that it is sufficient to obtain definite outcomes alone; among 
all remaining challenges, deriving the Born rule stands out in particular. 

2. Too much formal analysis has been done on the measurement problem (including 
the insolubility theorems just reviewed) without taking the special nature of measurement
devices into account; alas, this negligence has its roots in the work of
von Neumann. These devices are typically treated as ordinary quantum systems, 
as a consequence of which the notion of an “outcome” has to be defined within 
quantum mechanics and hence has to be identified e.g. with an eigenstate of some 
operator describing the apparatus (as in Theorem 11.2) or with some subspace of 
the quantum-mechanical state space (as in Theorem 11.4). Such identifications 
are purely formal and have little basis in experimental physics: as long as one 
defines outcomes of measurements within quantum mechanics, there is no measurement
problem (but at worst some unease concerning value indefiniteness)!
Fig. 11.1 The waves crashed between the towering cliff of Scylla and the jagged rocks of Charybdis.
Colour lithograph by Gino D'Antonio. Reprinted with permission from Look and Learn Ltd.


On the other hand, both the Copenhagen Interpretation and the Swiss approach
seem to have gone too far in the opposite direction: the former because it simply
assumed (without providing any justification) that measurements have outcomes
as soon as the apparatus is described classically, the latter in treating apparatuses
as strictly infinite, and hence falling victim to Earman’s Principle. The right approach,
then, must be to define measurement as in the Copenhagen Interpretation,
i.e. using a classical description of the apparatus whilst realizing it is ontologically
a quantum system, and thus navigate between Scylla (who treats
measurement devices as arbitrary quantum systems) and Charybdis (who is too
enthusiastic in taking infinite limits and hence in using a classical description).
3. Some kind of reality has to be attributed to the state of the system (though this 
reality cannot be “absolute”, as in classical physics). In the algebraic approach 
to quantum theory adopted throughout the present book, the starting point is provided
by the observables, relative to which states are defined. Since the doctrine
of classical concepts drives us to switch between quantum-mechanical and classical
descriptions, the reality of the quantum state is therefore perspectival. However,
their perspectival nature does not make states less real; they say everything
there is to say (at least by quantum theory) about some given level of description 
(which may be said to be chosen by the observer, and hence is intersubjective). 

Thus the measurement problem arises in the way Schrödinger (rather than von Neumann)
described it, although a precise framework has to be added to his poetry.

A framework that is precise both conceptually and mathematically is offered by 
asymptotic emergence, which we already encountered in our discussion of SSB in 
the previous chapter (see especially its preamble). To repeat the main points, we 
speak of asymptotic emergence if the following three conditions are all satisfied:
1. A “higher-level theory” H (which in the context of the measurement problem is
either classical mechanics or classical thermodynamics, depending on the measurement
setup) is a limiting case of some “lower-level theory” L (viz. quantum
mechanics, including quantum statistical mechanics of a finite system).

2. Theory H is well defined and understood by itself (typically predating L). 

3. Theory H has “emergent” features that cannot be explained by L, e.g. because L 
does not have any property inducing those feature(s) in the limit pertinent to H. 

The root of the measurement problem (and hence the relevance of asymptotic emergence),
then, lies in Bohr’s requirement that the outcomes of measurements on systems
defined within L be recorded in (at least the language of) H, so that, crucially,
measurement according to L is a notion external to L (if only partly), in particular
involving the relationship between L and H. None of the insolubility proofs
of the measurement problem takes this into account (although due to Butterfield’s
Principle these proofs remain relevant in a secondary way). The typical feature of
H that would be emergent in the above sense if the measurement problem were unresolved
is that every physical system subject to the theory H is ontologically in
a pure state; in Schrödinger’s words quoted in §11.1: in H, sharply focused photographs
of states are always possible (and hence any uncertainty or chance is due
to ignorance, as in classical physics). Now, whatever the ontological nature of states
in L, the states they induce in H should be real in the above sense, i.e., pure. But
this is precisely what does not seem to be the case in typical measurement situations
(e.g., Schrödinger’s Cat), where the post-measurement state on L induces a mixed
state on H. Just as in the case of SSB, this violates Butterfield’s Principle, which in
the case at hand states that since H is an idealization of L, any physical effect in H
must be foreshadowed in L: as L approaches H, sharp measurement outcomes (defined
as pure states in H) must arise from at least approximate single measurement
outcomes (i.e. “singly-peaked wave-functions”) in the relevant asymptotic regime
of L (since only these wave-functions give rise to pure classical states on H).

As noted before in the setting of SSB: violating Butterfield’s Principle means 
violating Earman’s Principle, which in turn leads to a violation of the link between
theory and reality. It is worth spelling this out for the measurement problem: 

• Reality is described by quantum mechanics (even in the Copenhagen Interpretation,
classical mechanics is an idealization of quantum mechanics);

• Real phenomena, in this case sharp measurement outcomes, are correctly described
by classical mechanics although this is an idealization;

• Quantum mechanics (allegedly) cannot possibly induce these phenomena in its
limit towards classical mechanics although it is the theory that should apply;

• Hence quantum mechanics contradicts reality. Classical mechanics does not contradict
the reality of sharp measurement outcomes, but it is not the appropriate
theory to explain them; this explanation should come from quantum mechanics.

It may now seem that invoking Butterfield’s Principle has reduced the measurement
problem to the usual one(s) described in the preceding sections. But look at
the small print: in the Copenhagen Interpretation, single measurement outcomes 
only appear in some limiting “classical” regime of quantum mechanics. 
“Deep inside” quantum mechanics, there is no need at all for the typical superposition
$\sum_n c_n \psi_n$ to collapse into one of the states $\psi_n$ (unless one conflates the physical
measurement problem with the philosophical problem of value indefiniteness). The
external and asymptotic nature of measurement outcomes causes the measurement 
problem, but, as we shall see, at the same time it provides the key for its solution, 
since the collapse mechanism we propose is only effective asymptotically (so that it 
operates where it should and does not act where it should not). More precisely, by 
taking into account perturbations of the Hamiltonian that are tiny and ineffective in 
the quantum regime, but become hugely destabilizing in the classical regime (even 
before the actual limit), the wave-function of the apparatus will collapse. 

Summarizing the preceding discussion, “our” measurement problem states that: 

• Certain pure post-measurement states of an (ontologically quantum-mechanical!) 
apparatus coupled to a microscopic quantum object induce mixed states on the 
apparatus (and on the composite) once the apparatus is described classically. 

This is a precise version of Schrödinger’s Cat problem (rather than von Neumann’s
purely quantum-mechanical measurement problem), making it clear that at heart the 
problem does not lie with the (dis)appearance of interference terms (which is a red 
herring) but with the inability of quantum mechanics to predict single outcomes. 

We now show by means of a simple example what it means to describe an ontologically
quantum-mechanical apparatus classically, and outline the scenario we
envisage for the solution of the measurement problem on the basis of this example. 
The Spehner-Haake model of the apparatus described below is too simple to be 
realistic, but nonetheless it may serve its purpose (as Bohr would say). The model 
involves a double-well potential like (10.11), modified however by a little basin in 
the middle, as shown below (including ground states for one large and one small 
value of $\hbar$). Also here, SSB will play a crucial role, so please recall §10.1.




Fig. 11.2 Double-well potential with basin; ground states $\psi_{\hbar=0.5}$, $\psi_{\hbar=0.01}$.


Consider $N + 1$ non-interacting particles, each with mass $m$, moving on
the real line under the influence of a one-particle potential $V$ (note that although
the zeroth particle will be handled slightly differently from the others, it is not the
pointer!). In terms of the canonical coordinates $(p', q') = (p_0, \ldots, p_N, q_0, \ldots, q_N) \in \mathbb{R}^{2(N+1)}$
on the phase space $X = T^*\mathbb{R}^{N+1}$, the classical Hamiltonian is

$$h(p', q') = \sum_{n'=0}^{N} \left( \frac{p_{n'}^2}{2m} + V(q_{n'}) \right). \qquad (11.11)$$

Now perform a canonical transformation to center of mass and relative coordinates 


P = 


N 

E p»' 

n'=0 


1 ^ 


1 ^ 

Ttn = Vl^Pn - —7= E Pn’ 

n '=0 


Pn 


^i^n-qo) {n 


( 11 . 12 ) 


,N); (11.13) 


the center of mass (P, Q) will be the pointer. The inverse transformation is given by 


P 1 

Pn - 




a E ^n, 

71=1 

(11.14) 


(11.15) 

N 

Ep«; 

(11.16) 

(11.17) 


Granted that $\{p_{n'}, q_{k'}\} = \delta_{n'k'}$ and $\{p_{n'}, p_{k'}\} = \{q_{n'}, q_{k'}\} = 0$, we then duly have
$\{P, Q\} = 1$ and $\{\pi_n, \rho_k\} = \delta_{nk}$, with all other elementary Poisson brackets vanishing.
In terms of the new coordinates, the classical Hamiltonian (11.11) reads 


$$h(P, Q, \pi, \rho) = h_A(P, Q) + h_{AE}(Q, \rho) + h_E(\pi), \qquad (11.18)$$

where $\pi = (\pi_1, \ldots, \pi_N)$, $\rho = (\rho_1, \ldots, \rho_N)$, and the three partial Hamiltonians are


hAiP,Q) = ^+N'V{Qy, 


] I N ( N 


N 

E 

71=1 


N 

E 

^77=1 


hAE{Q,p) = tl^fkip)y^'HQ), 


k=\ 


(11.19) 

( 11 . 20 ) 

( 11 . 21 ) 
where M = Nm is the total mass of the system, for simplicity we assumed V to be
analytic (it will even be taken to be polynomial), and we abbreviated 

/ , TV \* A' / 1 ^ 

A(P)- +E . ( 11 . 22 , 

Note that $f_1(\rho) = 0$, so that to lowest order (i.e. $k = 2$) we have

hAE(Q,p)=nNfp^-fp,Pi)v"(Q) + --- (11.23) 

\ n=l k^l j 

We pass to the corresponding quantum-mechanical Hamiltonians in the usual way, 
and couple a two-level quantum system to the apparatus through the Hamiltonian 

= P • o’a 0 P, (11.24) 

where the object observable $s = \sigma_3$, acting on $\mathbb{C}^2$, is to be measured. The idea
is that $H_A$ is the Hamiltonian of a pointer that registers outcomes by localization on
the real line, $H_E$ is the (free) Hamiltonian of the “environment”, realized as the internal
degrees of freedom of the total apparatus that are not used in recording
the outcome of the measurement, and $H_{AE}$ describes the pointer-environment interaction.
The classical description of the apparatus then involves two approximations:

• Ignoring all degrees of freedom except those of A, which classically are $(P, Q)$;

• Taking the classical limit of $H_A$, here realized as $N \to \infty$ (in lieu of $\hbar \to 0$).

The measurement of s is now expected to unfold according to the following scenario: 

1. The apparatus is initially in a metastable state (this is a very common assumption),
whose wave-function is e.g. a Gaussian centered at the origin.

2. If the object state is “spin up”, i.e., $\psi_S = (1,0)$, then it kicks the pointer to the
right, where it comes to a standstill at the bottom of the double well. If spin is
down, likewise to the left. If $\psi_S = (1,1)/\sqrt{2}$, the pointer moves to a superposition
of these, which is close to the ground state of V displayed in Figure 11.2.

3. In the last case, the Flea mechanism of §10.2 comes into play: tiny asymmetric
perturbations irrelevant for small N localize the ground state as $N \to \infty$.

4. Mere localization of the ground state of the perturbed (apparatus) Hamiltonian in 
the classical regime is not enough: there should be a dynamical transition from 
the ground state of the original (unperturbed) Hamiltonian (which has become 
metastable upon perturbation) to the ground state of the perturbed one. The dynamical
mechanism in question should also recover the Born rule.

Thus the classical description of the apparatus is at the same time the root of the 
measurement problem and the key to its solution: it creates the problem because at 
first sight a Schrödinger Cat state has the wrong classical limit (namely a mixture),
but it also solves it, because precisely in the classical limit Cat states are destabilized 
even by the tiniest (asymmetric) perturbations and collapse to the “right” states. 
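
The static part of this scenario can be illustrated numerically in a much cruder setting than the model in the text. The following sketch assumes a single particle of unit mass in the plain double well $V(x) = (x^2 - 1)^2$ (without the central basin), uses $\hbar \to 0$ in lieu of $N \to \infty$, and models the flea as a small Gaussian bump of height $10^{-3}$ on the right-hand well. The grid, the bump, and the function name are illustrative choices, not taken from the text. It diagonalizes a finite-difference Schrödinger operator and returns the weight of the ground state on the left half-line.

```python
import numpy as np

def ground_state_left_weight(hbar, flea=1e-3, n=601, half_width=3.0):
    """Weight on x < 0 of the ground state of -hbar^2/2 d^2/dx^2 + V + flea * bump.

    Assumptions (illustrative only): unit mass, V(x) = (x^2 - 1)^2, a Gaussian
    "flea" bump centred on the right-hand well, Dirichlet boundary conditions.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    potential = (x**2 - 1.0) ** 2
    bump = flea * np.exp(-((x - 1.0) ** 2) / 0.2)  # tiny asymmetric perturbation
    # Second-order finite-difference Laplacian (symmetric tridiagonal matrix).
    laplacian = (np.diag(-2.0 * np.ones(n))
                 + np.diag(np.ones(n - 1), 1)
                 + np.diag(np.ones(n - 1), -1)) / dx**2
    hamiltonian = -0.5 * hbar**2 * laplacian + np.diag(potential + bump)
    _, vectors = np.linalg.eigh(hamiltonian)  # eigenvalues come in ascending order
    psi = vectors[:, 0]                       # ground state
    prob = psi**2 / np.sum(psi**2)
    return float(np.sum(prob[x < 0]))

if __name__ == "__main__":
    for hbar in (0.5, 0.1):
        weight = ground_state_left_weight(hbar)
        print(f"hbar = {hbar}: ground-state weight on the left = {weight:.3f}")
```

For the larger value of $\hbar$ the ground state stays close to the symmetric doubly-peaked state, whereas for the smaller value the same tiny bump pushes essentially all of it into the left well. The sketch says nothing about the dynamical transition required in point 4.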
The “flea” perturbation might itself be a genuine random process, perhaps ultimately
of quantum origin. In that case, the measurement merely amplifies the randomness
that was already inherent in the flea by transferring it to the apparatus.

Alternatively, the flea might be fundamentally deterministic (though it may 
nonetheless be modeled stochastically for pragmatic reasons). In principle, this 
would open the door to a restoration of determinism: for the flea now transfers its 
determinism (rather than its randomness) to the apparatus. The mistaken impression 
that quantum theory implies the irreducible randomness of nature then arises because
although measurement outcomes are determined, they are unpredictable “for
all practical purposes”, even in a way that (because of the exponential sensitivity to
the flea in $1/\hbar$ or $N$) dwarfs the unpredictability of classical chaotic systems.

Either way, the flea perturbation would naturally be different at each different run 
of an experiment under otherwise identical initial conditions, which motivates our 
critique of the counterfactual discussed after the proof of Theorem 11.4. 

The location of the flea plays a similar role to the position variable in Bohmian 
mechanics, i.e., it is essentially a hidden variable. Recall the notions of Outcome 
Independence (OI) and Parameter Independence (PI), reviewed in §6.5. Briefly, the
conjunction of OI and PI is equivalent to Bell’s locality condition, and if the latter
is satisfied, then the Bell inequalities hold. Since these are violated by quantum
mechanics, any hidden variable theory compatible with quantum mechanics must
violate OI or PI. Deterministic hidden variable theories necessarily satisfy OI, in
which case Bell’s Theorem or the Free Will Theorem shows that they must violate
PI in order to be compatible with quantum mechanics. A violation of PI leads to
possible superluminal signaling only if the hidden variable z can be controlled. If
the wave-function $\psi$ is regarded as the hidden variable, then quantum theory itself
satisfies PI but violates OI (since $\psi$ can be prepared, the other way round would be
disastrous). Qua deterministic hidden variable theory, Bohmian mechanics satisfies
OI, and hence it violates PI; for the GRW collapse theory it is the other way round.
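
For orientation, the two conditions may be written out as follows. This is a sketch in the standard Jarrett-Shimony style of notation, which is an assumption here and may differ in detail from the conventions of §6.5: the two wings choose settings $a$ and $b$ and obtain outcomes $x$ and $y$, and $z$ is the hidden variable.

```latex
% Parameter Independence: the statistics in each wing do not depend on the distant setting.
P(x \mid a, b, z) = P(x \mid a, z), \qquad P(y \mid a, b, z) = P(y \mid b, z).
% Outcome Independence: given the settings and z, the two outcomes are uncorrelated.
P(x, y \mid a, b, z) = P(x \mid a, b, z)\, P(y \mid a, b, z).
% Together these give Bell's locality (factorizability) condition:
P(x, y \mid a, b, z) = P(x \mid a, z)\, P(y \mid b, z).
```

Conversely, summing the last line over $y$ (or over $x$) returns PI, after which OI follows. In a deterministic theory all these probabilities are 0 or 1, so that the joint distribution factorizes automatically; this is why such theories satisfy OI.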

The fate of the flea therefore depends on the nature of the perturbation: if it is 
deterministic, the theory behaves like Bohmian mechanics in this respect and hence 
violates PI, whereas stochastic perturbations typically violate OI (and possibly also
PI). Either way, no conflict with the said theorems arises. Moreover, in the Colbeck-Renner
Theorem, assumption CP fails for the flea scenario (assuming, in view of
its limitation to finite-dimensional Hilbert spaces, that the theorem is applicable at all)!
Besides such issues, others remain to be resolved, of which we just mention two: 

1. Collapse of the wave-function has become a tunneling process, whose static effects
are exponentially enhanced as $N \to \infty$ (or $\hbar \to 0$, as in §10.2). However,
tunneling times increase in the same way, so that the environment is needed not 
only to provide the perturbation, but also to speed up the dynamics of collapse. 

2. The flea not only destabilizes the Schrödinger Cat state (as desired), but also
destabilizes the intended outcome states (like those in Si, cf. Theorem 11.4). Also 
here the environment should play a decisive role in (re)stabilizing the latter but 
not the former, possibly through the mechanism of einselection, cf. §11.2. 
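
The tension in the first point can be made explicit with the textbook two-level (WKB) estimate for a symmetric double well. The symbols $\Delta$, $\omega$, $d_V$, $\delta$, $w_{\mathrm{right}}$ and $\tau$ below are introduced for illustration and are not taken from the text, and only leading exponential behaviour is claimed.

```latex
% Ground-state splitting of the unperturbed double well, with minima at x = \pm a
% where V = 0, and \omega the frequency at the bottom of each well:
\Delta \sim \hbar\omega\, e^{-d_V/\hbar}, \qquad d_V = \int_{-a}^{a} \sqrt{2 m V(x)}\, dx.
% Two-level model in the basis (left, right), with a flea raising the right well by \delta > 0:
H_{\mathrm{eff}} = \begin{pmatrix} 0 & -\Delta/2 \\ -\Delta/2 & \delta \end{pmatrix},
\qquad
w_{\mathrm{right}} = \frac{1}{2}\left(1 - \frac{\delta}{\sqrt{\delta^2 + \Delta^2}}\right).
% Static effect: w_right tends to 0 once \delta >> \Delta, hence for any fixed \delta as \hbar -> 0.
% Dynamics: the natural tunneling time of the unperturbed well grows in the same way,
\tau \sim \frac{\hbar}{\Delta} \sim \omega^{-1} e^{d_V/\hbar}.
```

Here $w_{\mathrm{right}}$ is the weight of the ground state of $H_{\mathrm{eff}}$ on the right-hand well: it equals $1/2$ for $\delta = 0$ and behaves as $\Delta^2/4\delta^2$ for $\delta \gg \Delta$. The same exponential that makes the flea effective thus makes unaided tunneling hopelessly slow.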
Notes

§11.1. The rise of orthodoxy 

The literature on the measurement problem is vast. Apart from the annotated 
reprint volume Wheeler & Zurek (1983), relatively recent surveys and books
include Bell (1990b), Maudlin (1995), Busch, Lahti, & Mittelstaedt (1996), Bassi
& Ghirardi (2003), Mittelstaedt (2004), Wallace (2012), Allahverdyan, Balian, &
Nieuwenhuizen (2013), and Busch, Lahti, Pellonpää, & Ylinen (2016). In modal
interpretations of quantum mechanics, the measurement problem is (dubiously) conflated
with the far milder problem of value indefiniteness, see e.g. Bub (1997).

§11.2. The rise of modernity: Swiss approach and Decoherence 

The Swiss approach to the measurement problem was initiated by Jauch (1964), 
to be continued by e.g. Hepp (1972), Emch & Whitten-Wolfe (1976), and recently 
also by Hepp’s former student Fröhlich; see e.g. Fröhlich & Schubnel (2013) and
Blanchard, Fröhlich & Schubnel (2016). In addition, see Landsman (1991, 1995; now
seen as naive), Breuer, Amann & Landsman (1993), and Sewell (2005).

Key early papers on decoherence were Zeh (1970), Zurek (1981), and Joos & Zeh
(1985), and standard reviews are Zurek (2003), Joos et al (2003), and Schlosshauer 
(2007). Penetrating critiques include Janssen (2008) and Tanona (2013). See also 
Camilleri (2009a) and Freire (2009) for some history. 

A defence of QBism may be found in Caves, Fuchs, & Schack (2002b). 

§11.3. Insolubility theorems 

Insolubility theorems of the first kind go back to von Neumann (1932) and,
in his wake, Wigner (1963) and Fine (1970). Theorem 11.2 is (in even more general
form) due to Busch & Shimony (1996); with slightly different assumptions, the special
case proved in the main text is due to Brown (1986). The monographs by Busch,
Lahti, & Mittelstaedt (1996) and Mittelstaedt (2004) contain detailed discussions of 
theorems of this kind. See also Bacciagaluppi (2014). 

The formulation of the problem of statistics and the problem of outcomes is taken 
from Maudlin (1995). Theorem 11.4 is due to Bassi & Ghirardi (2003), although 
here it is presented in a form inspired by Grübl (2003).

For Bohmian mechanics see e.g. Goldstein (2013) and Bricmont (2016). A recent 
review of the GRW program and related dynamical collapse theories is Bassi et al 
(2013). Nowadays, the locus classicus for Many Worlds is Wallace (2012). 

The time-evolution counterfactual discussed in the main text was inspired by the 
problem of free will, see the quotation of Dennett at the beginning of §6.3. 

§11.4. The Flea on Schrödinger’s Cat

The approach to the measurement problem discussed here has its roots in Landsman
& Reuvers (2013) and Landsman (2013), whose model at the time only involved
the apparatus. This was criticized in van Heugten & Wolters (2016), many
of whose points may be addressed by turning to the Spehner-Haake model, introduced
by Spehner & Haake (2008). The ABN-model of Allahverdyan, Balian, &
Nieuwenhuizen (2013) gives a similar picture; for a comparison see Spehner (2009).





Chapter 12 

Topos theory and quantum logic 


The topos-theoretic approach to quantum mechanics (also known as quantum 
toposophy) has the same origin as the quantum logic programme initiated by 
Birkhoff and von Neumann, namely the feeling that classical logic is inappropriate
for quantum theory and needs to be replaced by something else. For example,
Schrödinger’s Cat serves as an “intuition pump” for this feeling (at least in the naive
view, dispensed with in Chapter 11, that it is neither alive nor dead). However,
we feel that the quantum logic proposed by Birkhoff and von Neumann is: 

• too radical in giving up distributivity (rendering it problematic to interpret the 
logical operations $\wedge$ and $\vee$ as conjunction and disjunction, respectively);

• not radical enough in keeping the law of excluded middle, which is precisely
what intuition pumps like Schrödinger’s cat and the like challenge.

Thus it would be preferable to have a quantum logic with exactly the opposite features,
i.e., one that is distributive but drops the law of excluded middle: this suggests
the use of intuitionistic logic. It is interesting to note that Birkhoff and von Neumann
(who had earlier corresponded with Brouwer about possible intuitionistic aspects of 
game theory, notably chess) actually considered intuitionistic logic, but rejected it: 

‘The models for propositional calculi which have been considered in the preceding sections
are also interesting from the standpoint of pure logic. Their nature is determined by quasi-physical
and technical reasoning, different from the introspective and philosophical considerations
which have had to guide logicians hitherto. Hence it is interesting to compare the
modifications which they introduce into Boolean algebra, with those which logicians on “intuitionist”
and related grounds have tried introducing. The main difference seems to be that
whereas logicians have usually assumed that properties L71-L73 [i.e. $(a')' = a$, $a \cap a' = \bot$,
$a \cup a' = \top$, and $a \subset b$ implies $a' \supset b'$] of negation were the ones least able to withstand a
critical analysis, the study of mechanics points to the distributive identities as the weakest
link in the algebra of logic. (...) Our conclusion agrees perhaps more with those critiques
of logic, which find most objectionable the assumption that $a' \cup b = \top$ implies $a \subset b$ (or,
dually, the assumption that $a \cap b' = \bot$ implies $b \supset a$, the assumption that to deduce an
absurdity from the conjunction of a and not b, justifies one in inferring that a implies b).’
(Birkhoff & von Neumann, 1936, p. 837).

As already made clear, then, our view is exactly the opposite. It is perhaps more 
striking that our position on (quantum) logic also differs from Bohr’s: 
‘All departures from common language and ordinary logic are entirely avoided by reserving
the word “phenomenon” solely for reference to unambiguously communicable information, 
in the account of which the word “measurement” is used in its plain meaning of standardized 
comparison.’ (Bohr, 1996, p. 393) 

Rather than postulate the logical structure of quantum mechanics, our goal is to
derive it from our Bohrification ideology, more specifically, from the poset $\mathcal{C}(A)$
of all unital commutative C*-subalgebras of a unital C*-algebra A, ordered by inclusion.
One may think of this poset as a mathematical home for Bohr’s notion of
Complementarity, in that each $C \in \mathcal{C}(A)$ represents some classical or experimental
context, which has been decoupled from the others, except for the inclusion relations,
which relate compatible experiments (in general there seem to be no preferred
pairs of complementary subalgebras $C, C' \in \mathcal{C}(A)$ that jointly generate A, although
Bohr typically seems to have had such pairs in mind, e.g. position and momentum).

Quantum toposophy also accommodates the feeling that quantum mechanics is 
so radical that not just the actors of classical mechanics, but its whole stage must be 
replaced. This need is well expressed by the following quotation from Grothendieck, 
who created topos theory (but never witnessed its application to quantum theory): 

‘Passer de la mécanique de Newton à celle d’Einstein doit être un peu, pour le mathématicien,
comme de passer du bon vieux dialecte provençal à l’argot parisien dernier cri. Par contre,
passer à la mécanique quantique, j’imagine, c’est passer du français au chinois.’

(Grothendieck, 1986, p. 61).*

Indeed, topos theory replaces even set theory, seen as the stage of classical mathematics
and physics, by some other stage: each topos provides a “universe of discourse”
in which to do mathematics. One major difference with set theory, then, is
that logic in most toposes (including the ones we will use) is ... intuitionistic! 

This chapter presupposes familiarity with §C.11 on the logical side of the Gelfand
isomorphism for commutative C*-algebras, Appendix D on lattice theory and logic, 
and Appendix E on topos theory. Since this material is off the beaten track, as in 
Chapter 6 it may be helpful to provide a very brief guided tour through this chapter. 

In §12.1 we first define the “quantum mechanical” topos T(A) that will act as
the mathematical stage for the remainder of the chapter; it depends on some given
(unital) C*-algebra A only via the poset $\mathcal{C}(A)$. We then define C*-algebras internal
to any topos T (in which the natural numbers and hence the rationals can be
defined), which notion we then apply to T = T(A), so as to define an internal C*-algebra
$\underline{A}$, which turns out to be commutative. Following an interlude on constructive
Gelfand spectra in §12.2, in §12.3 we then compute the internal Gelfand spectrum
of $\underline{A}$ for $A = M_n(\mathbb{C})$, and derive our intuitionistic logic of quantum mechanics
from this, given by eqs. (12.95) - (12.96) and (12.103) - (12.107). We also discuss its
(Kripke) semantics. In §12.4 we generalize these computations to arbitrary (unital)
C*-algebras A, culminating in Corollary 12.22. Finally, in §12.5 we relate this material
both to the Kochen-Specker Theorem (which provided the original motivation
for quantum toposophy) and to an attempt at ontology called “Daseinisation.”

* ‘For a mathematician, switching from Newton’s mechanics to Einstein’s must to some extent 
be like switching from a good old provincial dialect to Paris slang. In contrast, I imagine that 
switching to quantum mechanics amounts to switching to Chinese.’ Translation by the author. 
12.1 C*-algebras in a topos

Let A be a unital C*-algebra (in Sets), with associated poset $\mathcal{C}(A)$ of all unital commutative
C*-subalgebras $C \subset A$ ordered by inclusion. Regarding $\mathcal{C}(A)$ as a (posetal)
category, in which there is a unique arrow $C \to D$ iff $C \subseteq D$ and there are no other
arrows, we obtain the topos T(A) of functors $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$ (F underlined!), i.e.,

$$\mathsf{T}(A) = [\mathcal{C}(A), \mathsf{Sets}]. \qquad (12.1)$$

Since for any poset X we have an isomorphism of categories $[X, \mathsf{Sets}] \simeq \mathrm{Sh}(X)$,
where X is endowed with the Alexandrov topology, see (E.84), we may alternatively
write

$$\mathsf{T}(A) \simeq \mathrm{Sh}(\mathcal{C}(A)). \qquad (12.2)$$

This alternative description will turn out to be very useful in computing the Gelfand 
spectrum of the internal commutative C*-algebra $\underline{A}$ to be defined shortly. Since we
occasionally switch between T(A) and the topos Sets, we underline objects (i.e.,
functors $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$) of the former. In order to do some kind of Analysis in
T(A), we need real numbers. In many toposes this is a tricky concept, but: 

Proposition 12.1. In T(A), the Dedekind reals are given by the constant functor

$$\underline{\mathbb{R}}_0 : C \mapsto \mathbb{R}, \qquad (12.3)$$

where $C \in \mathcal{C}(A)$, with associated frame given by the functor

$$\mathcal{O}(\underline{\mathbb{R}})_0 : C \mapsto \mathcal{O}((\uparrow C) \times \mathbb{R}). \qquad (12.4)$$

Similarly, we have complex numbers $\underline{\mathbb{C}}$ and their frame in T(A).

Proof. In a general sheaf topos Sh(X), the Dedekind real numbers object is the
sheaf (E.150), with frame (E.149). The point now is that each continuous function
$f \in C(\mathcal{C}(A), \mathbb{R})$ on $X = \mathcal{C}(A)$ with the Alexandrov topology is locally constant.

To see this, suppose $C \leq D$ in $U$, and take $V \subset \mathbb{R}$ open with $f(C) \in V$. Then
$C \in f^{-1}(V)$ and $f^{-1}(V)$ is open by continuity of $f$. But the smallest open set containing
C is $\uparrow C$, which contains D, so that $f(D) \in V$. Taking $V = (f(C) - \varepsilon, \infty)$
gives the inequality $f(D) > f(C) - \varepsilon$ for all $\varepsilon > 0$, whence $f(D) \geq f(C)$, whereas
$V = (-\infty, f(C) + \varepsilon)$ yields $f(D) \leq f(C)$. Hence $f(C) = f(D)$.

Thus we obtain (12.3) - (12.4) as special cases of (E.150) - (E.149). □

Other objects of interest in T(A) that we will steadily use are: 

• The terminal object $\underline{1}$, i.e., the constant functor $C \mapsto *$, where $*$ is a singleton.

• The truth object $\underline{\Omega}$, which according to (E.86) - (E.87) is given by

$$\underline{\Omega}_0(C) = \mathrm{Upper}(C); \qquad (12.5)$$

$$\underline{\Omega}_1(C \subseteq D) = (-) \cap (\uparrow D), \qquad (12.6)$$

where Upper(C) is the set of all upper sets above C (i.e., $S \in \mathrm{Upper}(C)$ iff $S \subset \mathcal{C}(A)$
such that: (i) $C \subseteq D$ for each $D \in S$, and (ii) $D \in S$ and $D \subseteq E$ imply $E \in S$).

• The subobject classifier $t : \underline{1} \to \underline{\Omega}$, which is a natural transformation whose components
$t_C$ are given, according to (E.88), as

$$t_C(*) = \uparrow C, \qquad (12.7)$$

i.e., the set of all $D \supseteq C$ in $\mathcal{C}(A)$; this is the maximal element of Upper(C).
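
To see these definitions at work, and why the resulting logic is intuitionistic, here is a small sketch on a hypothetical three-element poset (a bottom element below two incomparable elements) standing in for $\mathcal{C}(A)$. The poset and all function names are assumptions made purely for illustration. The code enumerates Upper(C), applies the restriction map (12.6), and computes the pseudo-complement inside Upper of the bottom element.

```python
from itertools import combinations

# Hypothetical toy poset: "bot" lies below the incomparable elements "D1" and "D2".
ELEMENTS = ("bot", "D1", "D2")
LEQ = {("bot", "bot"), ("D1", "D1"), ("D2", "D2"), ("bot", "D1"), ("bot", "D2")}

def up(c):
    """The up-set of c, i.e. all d with c <= d; this is the value t_C(*) of (12.7)."""
    return frozenset(d for d in ELEMENTS if (c, d) in LEQ)

def is_upper(s):
    """True if s is upward closed."""
    return all(e in s for d in s for e in up(d))

def upper_sets_above(c):
    """Upper(c): all upward closed subsets of up(c), as in (12.5)."""
    base = sorted(up(c))
    subsets = (frozenset(x) for r in range(len(base) + 1)
               for x in combinations(base, r))
    return {s for s in subsets if is_upper(s)}

def restrict(s, d):
    """The map Upper(c) -> Upper(d) of (12.6), for c <= d: intersect with up(d)."""
    return s & up(d)

def heyting_not(s, c):
    """Pseudo-complement in Upper(c): the largest upper set above c disjoint from s."""
    disjoint = [t for t in upper_sets_above(c) if not (t & s)]
    return frozenset().union(*disjoint)

if __name__ == "__main__":
    s = frozenset({"D1"})
    print(sorted(map(sorted, upper_sets_above("bot"))))
    print(sorted(s | heyting_not(s, "bot")), "versus top", sorted(up("bot")))
```

In this toy case Upper of the bottom element has five elements, and the union of $S = \{D_1\}$ with its pseudo-complement $\{D_2\}$ misses the bottom element, so it is not the top element $\uparrow C$: the law of excluded middle fails although the lattice is distributive.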
Furthermore, exponentials in T(A) have the following straightforward description: 

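$$\underline{F}^{\underline{G}}_0(C) = \mathrm{Nat}(\underline{G}_{\uparrow C}, \underline{F}_{\uparrow C}) \quad (C \in \mathcal{C}(A)), \qquad (12.8)$$

where $\underline{F}_{\uparrow C}$ is the restriction of the functor $\underline{F} : \mathcal{C}(A) \to \mathsf{Sets}$ to $\uparrow C \subset \mathcal{C}(A)$, and
$\mathrm{Nat}(-,-)$ denotes the set of natural transformations between the functors in question.
In particular, since $\mathbb{C} \cdot 1$ is the bottom element of the poset $\mathcal{C}(A)$, one has

$$\underline{F}^{\underline{G}}_0(\mathbb{C} \cdot 1) = \mathrm{Nat}(\underline{G}, \underline{F}). \qquad (12.9)$$

One way to derive (12.8) is to start from general sheaf toposes Sh(X), where

$$\underline{F}^{\underline{G}}(U) = \mathrm{Nat}(\underline{G}_{|U}, \underline{F}_{|U}), \qquad (12.10)$$

both restricted to $\mathcal{O}(U)$ (i.e. defined on each open $V \subseteq U$ instead of all $V \in \mathcal{O}(X)$),
and use (E.84). Combining these observations, one has

$$\underline{\Omega}^{\underline{F}}_0(C) \cong \mathrm{Sub}(\underline{F}_{\uparrow C}), \qquad (12.11)$$

i.e., the set of subfunctors of $\underline{F}_{\uparrow C}$. In particular, like in (12.9), we find

$$\underline{\Omega}^{\underline{F}}_0(\mathbb{C} \cdot 1) \cong \mathrm{Hom}(\underline{F}, \underline{\Omega}) \cong \mathrm{Sub}(\underline{F}), \qquad (12.12)$$

the set of subfunctors of $\underline{F}$ itself. Recall that, as explained after Lemma E.16, a
subfunctor $\underline{Z} \in \mathrm{Sub}(\underline{F})$ is a functor $\underline{Z} : \mathcal{C}(A) \to \mathsf{Sets}$ for which $\underline{Z}_0(C) \subseteq \underline{F}_0(C)$ for
all $C \in \mathcal{C}(A)$ and $\underline{Z}_1$ is the restriction of $\underline{F}_1$. If $C \subseteq D$, then the set-theoretic map
$\underline{\Omega}^{\underline{F}}_0(C) \to \underline{\Omega}^{\underline{F}}_0(D)$ defined by $\underline{\Omega}^{\underline{F}}_1$, identified with a map $\mathrm{Sub}(\underline{F}_{\uparrow C}) \to \mathrm{Sub}(\underline{F}_{\uparrow D})$, is
simply given by restricting a given subfunctor of $\underline{F}_{\uparrow C}$ to $\uparrow D$.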

Using either the internal language of a topos (see §E.5) or direct object-arrow 
constructions, one can copy standard definitions in set theory so as to define mathematical
objects “internal” to any given topos, as long as these definitions make
sense in first-order intuitionistic logic (which roughly speaking means that they are 
“constructive”, in not using the axiom of choice or the law of the excluded middle). 

As a case in point, let us now define internal C*-algebras in T(A) (this may
be done even more generally in any topos T in which at least the natural numbers
$\mathbb{N}$, and hence the rationals $\mathbb{Q}$, are defined). Vector spaces (over $\mathbb{R}$ or $\mathbb{C}$) and (commutative)
*-algebras may be defined in T(A) through straightforward object-arrow
translations of the usual constructions in Sets, i.e., one has an object $\underline{A}$ and arrows:
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•:CxA—^-A (scalar multiplication); (12.13) 

+ :AxA^A (addition); (12.14) 

x:AxA—>-4 (multiplication); (12.15) 

*:A—>^A (involution), (12.16) 


subject to the usual axioms. Syntactically, a unit (internal) in A is a constant 

1a : 1 —>■ A, 


with 1 the terminal object in T(A), such that 


IxA 


(lA.idA) 


A X A —>• A 



(12.17) 


The notions of norm and completeness are less easily defined internally, and
hence one starts reinterpreting the notion of a seminorm in Sets as a subset

$$N \subset A \times \mathbb{Q}^+,$$ (12.18)

for which

$$(a,q) \in N \ \text{ iff } \ \|a\| < q.$$ (12.19)

In our topos T(A), we interpret $N \subset A \times \mathbb{Q}^+$ as a subfunctor of $A \times \mathbb{Q}^+$ (or, equivalently
by λ-conversion (E.153), as an arrow $1 \to \Omega^{A \times \mathbb{Q}^+}$), subject to the axioms:

$$\forall_p\; p > 0 \to (0,p) \in N;$$ (12.20)

$$\exists_q\; q > 0 \wedge (a,q) \in N;$$ (12.21)

$$\forall_a \forall_p\; (a,p) \in N \to (a^*,p) \in N;$$ (12.22)

$$\forall_a \forall_q\; \big((a,q) \in N \leftrightarrow \exists_p\; p < q \wedge (a,p) \in N\big);$$ (12.23)

$$\forall_{a,b} \forall_{p,q}\; \big((a,p) \in N \wedge (b,q) \in N \to (a+b, p+q) \in N\big);$$ (12.24)

$$\forall_{a,b} \forall_{p,q}\; \big((a,p) \in N \wedge (b,q) \in N \to (a \cdot b, p \cdot q) \in N\big);$$ (12.25)

$$\forall_a \forall_{p,q} \forall_z\; \big((a,p) \in N \wedge (|z| < q) \to (z \cdot a, p \cdot q) \in N\big).$$ (12.26)

Here a, b are variables of type A, p and q are variables of type $\mathbb{Q}$, z is a variable
of type $\mathbb{C}$, 0 is the zero constant in A, etc. For a unital *-algebra (whose internal
definition we leave to the reader), with unit denoted by $1_A$ as usual, we also require

$$\forall_p\; p > 1 \to (1_A,p) \in N.$$ (12.27)

If the seminorm relation furthermore satisfies

$$(a^* \cdot a, q^2) \in N \leftrightarrow (a,q) \in N$$ (12.28)

for all $a \in A$ and $q \in \mathbb{Q}^+$, then A is said to be a pre-semi-C*-algebra.

To proceed to a C*-algebra, one requires $a = 0$ whenever $(a,q) \in N$ for all q in
$\mathbb{Q}^+$, making the seminorm into a norm, and subsequently this normed space should
be complete. The latter condition is quite complicated, since in a topos one has no
Cauchy sequences in the usual sense, because A may not have global elements (in
the sense of arrows $1 \to A$). Indeed, our algebra $\underline{A}$ defined below only has trivial
global elements, namely multiples of the unit operator.

Hence one needs a generalization of Cauchy sequences in the general spirit of 
topos theory, where global elements are replaced by general elements. 

Definition 12.2. With $\underline{\mathbb{N}}$ the natural numbers object in T(A) (which is simply the
constant functor $C \mapsto \mathbb{N}$), a Cauchy approximation in A is an arrow $s : \underline{\mathbb{N}} \to \Omega^A$
(or, equivalently, by λ-conversion (E.153), an arrow $\hat{s} : \underline{\mathbb{N}} \times A \to \Omega$, which in turn
is the same as a subobject S of $\underline{\mathbb{N}} \times A$) such that:

$$\forall_n \exists_a\; a \in s_n;$$ (12.29)

$$\forall_k \exists_m \forall_{n,n'} \forall_{a,a'}\; (n > m,\ n' > m,\ a \in s_n,\ a' \in s_{n'}) \to (a - a', 1/k) \in N.$$ (12.30)

Here (for brevity) the first three commas (but not the last!) stand for $\wedge$, and $a \in s_n$
denotes $(n,a) \in S$, where S is the above subobject of $\underline{\mathbb{N}} \times A$ classified by $\hat{s}$ (we use
the notation explained in item 9 at the end of §E.5, where the variable x : X is now
the pair (n,a) of type $\underline{\mathbb{N}} \times A$). Moreover, a Cauchy approximation converges to b if:

$$\forall_k \exists_m \forall_n \forall_a\; (n > m,\ a \in s_n) \to (a - b, 1/k) \in N,$$ (12.31)

and we call A complete if each Cauchy approximation in A converges.

Finally, a C*-algebra in T(A) (and similarly in any topos with natural numbers) 
is a complete pre-semi-C*-algebra in which the semi-norm is a norm. 

Homomorphisms and isomorphisms between such (internal) C*-algebras may be
defined in the usual way, bijections in set theory being replaced by isomorphisms
of objects. We only consider internal C*-algebras with unit, so that we may define
internal categories $\mathsf{CA}_1$ (and $\mathsf{CCA}_1$) of (commutative) unital C*-algebras in T(A) in
the obvious way (where the homomorphisms are required to preserve the unit).

We now come to the basic construction that underlies “quantum toposophy”.

Theorem 12.3. Let A be a unital C*-algebra. Define a functor $\underline{A} \in$ T(A) by

$$\underline{A} : \mathcal{C}(A) \to \mathsf{Sets};$$ (12.32)

$$\underline{A}_0(C) = C;$$ (12.33)

$$\underline{A}_1(C \subseteq D) = (C \hookrightarrow D).$$ (12.34)

Then $\underline{A}$ is an internal unital commutative C*-algebra under pointwise operations.

Here A is meant to be an “ordinary” unital C*-algebra, i.e., defined in Sets. Note that
the symbol C in (12.33) changes character from left to right: on the left-hand side it
is a point in $\mathcal{C}(A)$, whereas on the right-hand side it is a subset of A. Nonetheless,
one might describe $\underline{A}$ as the tautological functor in $[\mathcal{C}(A), \mathsf{Sets}]$.

The pointwise operations in $\underline{A}$ are the obvious natural transformations that are
ultimately defined by the corresponding operations in each commutative C*-algebra
C. For example, addition $+ : \underline{A} \times \underline{A} \to \underline{A}$ is a natural transformation with components
$+_C : C \times C \to C$ defined in C, etc. Commutativity of $\underline{A}$ then trivially follows from
commutativity of each commutative C*-subalgebra C.

As already mentioned, the unit $1_{\underline{A}}$ is syntactically a constant $1_{\underline{A}} : 1 \to \underline{A}$, whose
components $(1_{\underline{A}})_C : * \to C$ are just the units $1_C$ in each C (recall that elements of
our poset $\mathcal{C}(A)$ were defined as unital commutative C*-subalgebras of A!).

Finally, we regard the (semi)norm N as a subobject of $\underline{A} \times \mathbb{R}^+$ (or $\underline{A} \times \mathbb{Q}^+$),
hence as a natural transformation, with components $N_C \subset C \times \mathbb{R}^+$ defined by

$$(c,q) \in N_C \ \text{ iff } \ \|c\| < q,$$ (12.35)

where $\| \cdot \|$ is the norm in C (which of course is inherited from A).

Proof. The proof is a straightforward verification, except perhaps for completeness.
First, the above subobject S of $\underline{\mathbb{N}} \times \underline{A}$, realized as a subfunctor as usual, looks as
follows: for each $C \in \mathcal{C}(A)$ we have a subset $S_C \subset \mathbb{N} \times C$, regarded as a sequence
$(C_n)$ of subsets of C through the identification $(n,c) \in S_C$ iff $c \in C_n$, such that $C_n \subseteq D_n$
whenever $C \subseteq D$. Unfolding axiom (12.29) using the Kripke-Joyal semantics
rules listed at the end of §E.5, we find that this axiom holds iff:

$$\forall_{C \in \mathcal{C}(A)} \forall_{n \in \mathbb{N}} \exists_{c \in C} \forall_{D \supseteq C}\; c \in D_n,$$ (12.36)

which is satisfied iff each of the above subsets $C_n \subset C$ is non-empty. By a similar
analysis, axiom (12.30) is satisfied iff for each $\varepsilon > 0$ there is $m \in \mathbb{N}$ such that for all
$n, n' > m$ and all $c \in C_n$, $c' \in C_{n'}$ one has $\|c - c'\| < \varepsilon$ in C. This simply means that
any choice $(c_n)$ where $c_n \in C_n$ is a Cauchy sequence in C. Accordingly, $\underline{A}$ is complete
provided each such sequence converges, i.e., iff each $C \in \mathcal{C}(A)$ is complete. Since
these C’s are C*-subalgebras of A, this is simply true by construction. □

In a similar way, one easily proves the following generalization of Theorem 12.3: 

Theorem 12.4. Let $\mathsf{C}$ be a small category. Any internal C*-algebra in the associated
presheaf topos $[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$ is given by a contravariant functor $\underline{A} : \mathsf{C} \to \mathsf{CA}$, where
$\mathsf{CA}$ is the category that has C*-algebras as objects and homomorphisms as arrows.
Moreover, $\underline{A}$ is unital/commutative iff each C*-algebra $\underline{A}(C)$ is unital/commutative.

It should be mentioned that internal C*-algebras on sheaf toposes T = Sh(X) are not
covered by this theorem (except in the somewhat degenerate case we use, namely
$X = \mathcal{C}(A)$ with the Alexandrov topology). As a case in point, we just mention the
beautiful fact that internal C*-algebras in Sh(X) correspond to continuous bundles
of C*-algebras over X (in Sets).

12.2 The Gelfand spectrum in constructive mathematics

In this chapter we rely on a particular construction of the frame $\mathcal{O}(\Sigma(A))$ (cf. §C.11)
that can be generalized to topos theory (in which the Gelfand spectrum $\Sigma(A)$ of an
internal commutative C*-algebra A is a locale). We start with some lattice lore.

Definition 12.5. Let L be a distributive lattice with top $\top$ and bottom $\bot$.

1. A lower set in L is a subset $S \subseteq L$ such that if $x \in S$ and $y \leq x$, then $y \in S$. We
denote the poset of all lower subsets of L, ordered by inclusion, by D(L).

2. An ideal in a lattice L is a lower set I in L such that $x, y \in I$ implies $x \vee y \in I$. The
poset of all ideals in a lattice L, ordered by inclusion, is denoted by Idl(L).

3. We say that $x \ll y$ (in words: “x is well inside y” or “x is rather below y”) iff
there exists z such that $x \wedge z = \bot$ and $y \vee z = \top$. Note that $x \ll y$ implies $x \leq y$, as

$$x = x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z) = x \wedge y \leq y.$$ (12.37)

4. An ideal $I \in \mathrm{Idl}(L)$ is regular if the condition $I \supseteq \{y \in L \mid y \ll x\}$ implies $x \in I$.
The poset of regular ideals in L, ordered by inclusion, is called RIdl(L), i.e.,

$$\mathrm{RIdl}(L) = \{I \in \mathrm{Idl}(L) \mid (\forall_{y \in L}\; y \ll x \Rightarrow y \in I) \Rightarrow x \in I\}.$$ (12.38)

The posets D(L), Idl(L) and RIdl(L) are easily seen to be frames. Any ideal $I \in \mathrm{Idl}(L)$
can be regularized, i.e., turned into a regular ideal $\mathcal{A}(I)$, by means of the
restriction to $\mathrm{Idl}(L) \subset D(L)$ of the “closure” map $\mathcal{A} : D(L) \to D(L)$ defined by

$$\mathcal{A}(I) = \{x \in L \mid \forall_{y \in L}\; y \ll x \Rightarrow y \in I\}.$$ (12.39)

In terms of $\mathcal{A}$, the canonical map $x \mapsto \downarrow x$ from L to Idl(L) “regularizes” to a map

$$f : L \to \mathrm{RIdl}(L);$$ (12.40)

$$x \mapsto \mathcal{A}(\downarrow x).$$ (12.41)

For $I \in \mathrm{RIdl}(L)$ we obviously have $\mathcal{A}(I) = I$, and hence we may write

$$\mathrm{RIdl}(L) = \{I \in \mathrm{Idl}(L) \mid \mathcal{A}(I) = I\}.$$ (12.42)
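For a finite lattice, Definition 12.5 can be tested by brute force. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, ideals are taken to be non-empty (i.e., to contain the bottom element), and the two example lattices (the Boolean lattice of subsets of a two-point set and a three-element chain) are chosen here for convenience and do not come from the text.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a finite collection, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def regular_ideals(elems, meet, join, bottom, top):
    """Regular ideals of a finite lattice, via the closure map of (12.39) and (12.42)."""
    def leq(x, y):
        return meet(x, y) == x

    def well_inside(x, y):
        # x << y iff some z satisfies x ∧ z = ⊥ and y ∨ z = ⊤ (Definition 12.5.3)
        return any(meet(x, z) == bottom and join(y, z) == top for z in elems)

    def is_ideal(I):
        # Assumption: ideals are non-empty lower sets closed under binary joins.
        return (bottom in I
                and all(y in I for x in I for y in elems if leq(y, x))
                and all(join(x, y) in I for x in I for y in I))

    def closure(I):
        # A(I) = {x | every y << x lies in I}, cf. (12.39)
        return frozenset(x for x in elems
                         if all(y in I for y in elems if well_inside(y, x)))

    all_ideals = [frozenset(s) for s in powerset(elems) if is_ideal(frozenset(s))]
    return [I for I in all_ideals if closure(I) == I]

# Boolean lattice of all subsets of {0, 1}
boolean = [frozenset(s) for s in powerset([0, 1])]
bool_reg = regular_ideals(boolean, lambda x, y: x & y, lambda x, y: x | y,
                          frozenset(), frozenset({0, 1}))

# Three-element chain 0 < 1 < 2
chain_reg = regular_ideals([0, 1, 2], min, max, 0, 2)

print(len(bool_reg), len(chain_reg))
```

In the Boolean case the relation $\ll$ coincides with $\leq$, so every ideal is regular; in the chain the ideal consisting of the bottom element alone is not regular, because its closure also contains the middle element.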


Definition 12.6. 1. A frame $\mathcal{O}(X)$ with top element $\top$ is called compact if every
subset $S \subset \mathcal{O}(X)$ with $\bigvee S = \top$ has a finite subset $F \subset S$ with $\bigvee F = \top$.

2. A frame $\mathcal{O}(X)$ is called regular if each $V \in \mathcal{O}(X)$ satisfies

$$V = \bigvee \{U \in \mathcal{O}(X) \mid U \ll V\}.$$ (12.43)

When $\mathcal{O}(X)$ is the topology of some space X, the frame $\mathcal{O}(X)$ is compact (regular)
iff X is compact (regular) as a space. Furthermore, X is compact and Hausdorff iff
it is compact and regular, and hence the Gelfand spectrum $\Sigma(A)$ of a commutative
unital C*-algebra A will be a compact and regular frame; see Theorem 12.8 below.




Recall that the self-adjoint part $A_{sa}$ of any C*-algebra A is partially ordered by
putting $a \leq b$ iff $b - a \in A^+$, cf. §C.7. This partial order is, of course, inherited by
the positive cone $A^+ \subset A_{sa}$. If A is commutative, this partial ordering makes $A_{sa}$ a
lattice; for example, if A = C(X) the lattice operations are $a \vee b = \max\{a,b\}$ and
$a \wedge b = \min\{a,b\}$ (taken pointwise). In general, one may then compute $\vee$ and $\wedge$
from the Gelfand isomorphism $A \cong C(X)$, but they are intrinsically defined via $\leq$.

Let A be a commutative unital C*-algebra. For $a, b \in A^+$, define $a \preccurlyeq b$ iff there
exists $n \in \mathbb{N}$ such that $a \leq nb$. Define $a \approx b$ iff $a \preccurlyeq b$ and $b \preccurlyeq a$. This is an equivalence
relation. Moreover, $\approx$ is a congruence, that is, an equivalence relation $\sim$ on a lattice
L that is compatible with $\wedge$ and $\vee$ in the sense that $x \sim y$ and $x' \sim y'$ imply
$x \wedge x' \sim y \wedge y'$ and $x \vee x' \sim y \vee y'$. Given some congruence $\sim$ on L, one may define $\wedge$ and $\vee$ on
$L/\!\sim$ by $[x] \wedge [y] = [x \wedge y]$ and $[x] \vee [y] = [x \vee y]$, respectively, so that the set-theoretic
quotient $L/\!\sim$ inherits the lattice structure of L and hence is a lattice in its own right.

This quotient construction by a congruence preserves distributivity, so that

$$L_A = A^+/\!\approx$$ (12.44)

is a distributive lattice. We will use the elements $D_a = [a^+]$ of $L_A$ (indexed by
$a \in A_{sa}$), where $[a^+]$ is the equivalence class in $L_A$ of the positive part $a^+$ in the canonical
decomposition $a = a^+ - a^-$, with $a^{\pm} \geq 0$ and $a^+ a^- = 0$; lattice-theoretically, one
has $a^+ = a \vee 0$ and $a^- = -(a \wedge 0)$. This gives a lattice homomorphism $A_{sa} \to L_A$, $a \mapsto D_a$,
whose restriction to $A^+$ is just the canonical projection $A^+ \to L_A$. These $D_a$ satisfy:

$$D_1 = \top;$$ (12.45)

$$D_a \wedge D_{-a} = \bot;$$ (12.46)

$$D_a = \bot \quad (a \leq 0);$$ (12.47)

$$D_{a+b} \leq D_a \vee D_b;$$ (12.48)

$$D_a \wedge D_b \leq D_{ab};$$ (12.49)

$$D_{ab} \leq D_a \vee D_{-b},$$ (12.50)

where the inequalities may also be written as equalities, since $x \leq y$ iff $x = x \wedge y$.
These relations are easy to check for A = C(X), and hence they are true for any A.
The elements $D_a$ obviously exhaust $L_A$, and eqs. (12.45) - (12.50) imply:

$$a \leq b \Rightarrow D_a \leq D_b;$$ (12.51)

$$D_a = D_{a^+};$$ (12.52)

$$D_{na} = D_a \quad (n \in \mathbb{N});$$ (12.53)

$$D_{ab} = (D_a \wedge D_b) \vee (D_{-a} \wedge D_{-b});$$ (12.54)

$$D_a \wedge D_b = D_{a \wedge b}.$$ (12.55)
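The remark that these relations are easy to check for A = C(X) can be made concrete when X is a finite set, so that a self-adjoint element is a real n-tuple. The Python sketch below is an illustration only; it assumes (as an encoding chosen here, not taken from the text) that the class $D_a = [a^+]$ is represented by the set of indices at which a is strictly positive, with intersection and union as meet and join.

```python
import random

def D(a):
    """Encode D_a = [a^+] for A = C^n by the set of indices where a is strictly positive."""
    return frozenset(i for i, x in enumerate(a) if x > 0)

def check_relations(a, b):
    """Check (12.45)-(12.50) and the consequences (12.54)-(12.55) for real tuples a, b."""
    n = len(a)
    top, bottom = frozenset(range(n)), frozenset()
    neg = lambda v: tuple(-x for x in v)
    total = tuple(x + y for x, y in zip(a, b))
    prod = tuple(x * y for x, y in zip(a, b))
    inf = tuple(min(x, y) for x, y in zip(a, b))
    return all([
        D((1,) * n) == top,                                      # (12.45)
        D(a) & D(neg(a)) == bottom,                              # (12.46)
        (not all(x <= 0 for x in a)) or D(a) == bottom,          # (12.47)
        D(total) <= (D(a) | D(b)),                               # (12.48)
        (D(a) & D(b)) <= D(prod),                                # (12.49)
        D(prod) <= (D(a) | D(neg(b))),                           # (12.50)
        D(prod) == (D(a) & D(b)) | (D(neg(a)) & D(neg(b))),      # (12.54)
        D(a) & D(b) == D(inf),                                   # (12.55)
    ])

random.seed(0)
samples = [(tuple(random.randint(-3, 3) for _ in range(4)),
            tuple(random.randint(-3, 3) for _ in range(4))) for _ in range(200)]
print(all(check_relations(a, b) for a, b in samples))
```

Integer entries are used to avoid floating-point comparisons; the encoding by supports matches the description of $\approx$ for $A = \mathbb{C}^n$, in which two positive elements are equivalent iff they vanish at the same indices.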

For the Gelfand spectrum we need the frame $\mathrm{RIdl}(L_A)$, hence the relation $\ll$.

Lemma 12.7. For all $D_a, D_b \in L_A$, we have (with both $q \in \mathbb{Q}^+$ and $q \in \mathbb{R}^+$):

$$D_b \ll D_a \ \text{ iff } \ \exists_{q>0}\; D_b \leq D_{a-q}.$$ (12.56)

Proof. From right to left, just choose $D_c = D_{q-a}$. Conversely, if A = C(X), it is easy
to see that if there exists $D_c \in L_A$ such that $D_c \vee D_a = \top$ and $D_c \wedge D_b = \bot$, then there
exists $q > 0$ such that $D_c \vee D_{a-q} = \top$. Hence $D_c \vee D_{a-q} = \top$ also in general, so that

$$D_b = D_b \wedge (D_c \vee D_{a-q}) = D_b \wedge D_{a-q} \leq D_{a-q}.$$ □
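The right-to-left direction can be spelled out in two lines. The following is a sketch that uses only the relations (12.45), (12.46) and (12.48), the hypothesis $D_b \leq D_{a-q}$, and the observation (an easy consequence of the definition of $\approx$, not stated in the text) that $q \cdot 1 \approx 1$ in $A^+$ for every $q > 0$:

```latex
\begin{align*}
D_b \wedge D_{q-a} &\leq D_{a-q} \wedge D_{-(a-q)} = \bot
   && \text{by (12.46)};\\
\top = D_1 = D_{q \cdot 1} = D_{a + (q-a)} &\leq D_a \vee D_{q-a}
   && \text{by (12.45) and (12.48)}.
\end{align*}
```

Hence $z = D_{q-a}$ satisfies $D_b \wedge z = \bot$ and $D_a \vee z = \top$, which is the definition of $D_b \ll D_a$.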

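Note that by construction the map f in (12.40) is given by

$$f(D_a) = \{D_c \in L_A \mid \forall_{D_b \in L_A}\; D_b \ll D_c \Rightarrow D_b \leq D_a\},$$ (12.57)

and, by Lemma 12.7, satisfies

$$f(D_a) \leq \bigvee \{f(D_{a-q}) \mid q > 0\}.$$ (12.58)

For later use, also note that (12.57) implies

$$f(D_a) = \top \Leftrightarrow D_a = \top.$$ (12.59)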

Theorem 12.8. The topology $\mathcal{O}(\Sigma(A))$ of the Gelfand spectrum $\Sigma(A)$ of a commutative
unital C*-algebra A is isomorphic to the frame of all regular ideals of $L_A$:

$$\mathcal{O}(\Sigma(A)) \cong \mathrm{RIdl}(L_A);$$ (12.60)

$$\{\omega \in \Sigma(A) \mid \omega(a) > 0\} \leftrightarrow f(D_a),$$ (12.61)

or, equivalently, for the opens $(r,s) \in \mathcal{O}(\mathbb{R})$ with ensuing opens $\hat{a}^{-1}(r,s)$ in $\mathcal{O}(\Sigma(A))$,

$$\hat{a}^{-1}(r,s) = \{\omega \in \Sigma(A) \mid \omega(a) \in (r,s)\} \leftrightarrow f(D_{s-a} \wedge D_{a-r}) \quad (r < s).$$ (12.62)

Moreover, on this isomorphism, $\mathcal{O}(\Sigma(A))$ is a compact regular frame.

The proof of this theorem is unfortunately beyond our reach; instead, we now give
an alternative description of the frame $\mathrm{RIdl}(L_A)$, which will be useful for computational
purposes in topos theory. This again requires some more background in lattice
theory. Let $(L, \leq)$ be a meet semilattice (i.e., a poset in which any pair of elements
has an infimum; in most of our applications $(L, \leq)$ is actually a distributive lattice).

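Definition 12.9. A covering relation on L is a relation $\lhd \subset L \times \mathcal{P}(L)$ (equivalently,
a function $L \to \mathcal{P}(\mathcal{P}(L))$), written $x \lhd U$ when $(x,U) \in \lhd$, such that:

1. If $x \in U$ then $x \lhd U$.

2. If $x \lhd U$ and $U \lhd V$ (i.e., $y \lhd V$ for all $y \in U$) then $x \lhd V$.

3. If $x \lhd U$ then $x \wedge y \lhd U$.

4. If $x \lhd U$ and $x \lhd V$, then $x \lhd U \wedge V$ (where $U \wedge V = \{x \wedge y \mid x \in U, y \in V\}$).

For example, if $(L, \leq) = (\mathcal{O}(X), \subseteq)$ one may take $x \lhd U$ iff $x \leq \bigvee U$, i.e., iff U
covers x. Also here we have a closure operation $\mathcal{A} : D(L) \to D(L)$, given by

$$\mathcal{A}U = \{x \in L \mid x \lhd U\}.$$ (12.63)

This operation has the following properties:

$$\downarrow U \subseteq \mathcal{A}U;$$ (12.64)

$$U \subseteq \mathcal{A}V \Rightarrow \mathcal{A}U \subseteq \mathcal{A}V;$$ (12.65)

$$\mathcal{A}U \cap \mathcal{A}V \subseteq \mathcal{A}(\downarrow U \cap \downarrow V).$$ (12.66)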


The frame $\mathcal{O}(L, \lhd)$ generated by such a structure is then defined by

$$\mathcal{O}(L, \lhd) = \{U \in D(L) \mid \mathcal{A}U = U\} = \{U \in \mathcal{P}(L) \mid x \lhd U \Rightarrow x \in U\};$$ (12.67)

the second equality follows because firstly the property $\mathcal{A}U = U$ guarantees that
$U \in D(L)$, and secondly one has $\mathcal{A}U = U$ iff $x \lhd U$ implies $x \in U$. Defining

$$U \sim V \ \text{ iff } \ U \lhd V \text{ and } V \lhd U,$$ (12.68)

an equivalent description of the frame $\mathcal{O}(L, \lhd)$ that is occasionally useful is

$$\mathcal{O}(L, \lhd) \cong \mathcal{P}(L)/\!\sim.$$ (12.69)

Indeed, the map $U \mapsto [U]$ from $\mathcal{O}(L, \lhd)$ (as defined in (12.67)) to $\mathcal{P}(L)/\!\sim$ is a
frame map with inverse $[U] \mapsto \mathcal{A}U$. The idea behind the isomorphism (12.69) is
that the map $\mathcal{A}$ picks a unique representative in the equivalence class [U], namely
$\mathcal{A}U$. As in (12.40) - (12.41), also here we have a canonical map

$$f : L \to \mathcal{O}(L, \lhd);$$ (12.70)

$$x \mapsto \mathcal{A}(\downarrow x),$$ (12.71)

which satisfies $f(x) \leq \bigvee f(U)$ if $x \lhd U$. In fact, f is universal with this property, in
that any homomorphism $g : L \to \mathcal{G}$ of meet semilattices into a frame $\mathcal{G}$ such that
$g(x) \leq \bigvee g(U)$ whenever $x \lhd U$ has a factorisation $g = \varphi \circ f$ for some unique frame
map $\varphi : \mathcal{O}(L, \lhd) \to \mathcal{G}$. This may suggest the following result:

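Proposition 12.10. Suppose one has a frame $\mathcal{F}$ and a meet semilattice L with a
map $f : L \to \mathcal{F}$ of meet semilattices that generates $\mathcal{F}$ in the sense that for each
$U \in \mathcal{F}$ one has $U = \bigvee \{f(x) \mid x \in L, f(x) \leq U\}$. Define a cover relation $\lhd$ on L by

$$x \lhd U \ \text{ iff } \ f(x) \leq \bigvee f(U).$$ (12.72)

Then one has a frame isomorphism $\mathcal{F} \cong \mathcal{O}(L, \lhd)$.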

We now turn to maps between frames, from the point of view of coverings. 

Definition 12.11. Let $(L, \lhd)$ and $(M, \blacktriangleleft)$ be meet semilattices with covering relation
as above, and let $f^* : L \to \mathcal{P}(M)$ be such that:

1. $f^*(L) = M$;

2. $f^*(x) \wedge f^*(y) \blacktriangleleft f^*(x \wedge y)$;

3. $x \lhd U \Rightarrow f^*(x) \blacktriangleleft f^*(U)$ (where $f^*(U) = \bigcup_{u \in U} f^*(u)$).

If L and M have top elements $\top_L$ and $\top_M$, respectively, then the first condition
may be replaced by $f^*(\top_L) = \top_M$. Define two such maps $f_1^*$ and $f_2^*$ to be equivalent
if $f_1^*(x) \sim f_2^*(x)$ (i.e., $f_1^*(x) \blacktriangleleft f_2^*(x)$ and $f_2^*(x) \blacktriangleleft f_1^*(x)$) for all $x \in L$. A continuous
map $f : (M, \blacktriangleleft) \to (L, \lhd)$ is an equivalence class of such maps $f^* : L \to \mathcal{P}(M)$.

Our main interest in continuous maps lies in the following result: 

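Proposition 12.12. Each continuous map $f : (M, \blacktriangleleft) \to (L, \lhd)$ is equivalent to a
frame map $\mathcal{O}(f) : \mathcal{O}(L, \lhd) \to \mathcal{O}(M, \blacktriangleleft)$, given by

$$\mathcal{O}(f) : U \mapsto \mathcal{A} f^*(U).$$ (12.73)

We may now equip $L_A$ with the covering relation defined by (12.72), given
(12.60) and the ensuing map (12.57). Consequently, by Proposition 12.10 one has

$$\mathcal{O}(\Sigma(A)) \cong \mathcal{O}(L_A, \lhd),$$ (12.74)

which yields the following expression for the constructive Gelfand spectrum:

$$\mathcal{O}(\Sigma(A)) \cong \{U \in D(L_A) \mid x \lhd U \Rightarrow x \in U\}.$$ (12.75)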

This lattice becomes computable through a lemma that is crucial for what follows: 

Lemma 12.13. In any topos, the covering relation $\lhd$ on $L_A$ defined by (12.72) with
(12.60) and (12.57), is given by $D_a \lhd U$ iff for all $q > 0$ there exists a (Kuratowski)
finite $U_0 \subseteq U$ such that $D_{a-q} \leq \bigvee U_0$. If U is directed, this means that there exists
$D_b \in U$ such that $D_{a-q} \leq D_b$.

Proof. The easy part is the “$\Leftarrow$” direction: from (12.58) and the assumption we have
$f(D_a) \leq \bigvee f(U)$ and hence $D_a \lhd U$ by definition of the covering relation.

In the opposite direction, assume $D_a \lhd U$ and take some $q > 0$. From (the proof
of) Lemma 12.7, $D_a \vee D_{q-a} = \top$, hence $\bigvee f(U) \vee f(D_{q-a}) = \top$. Since $\mathcal{O}(\Sigma)$ is
compact, there is a finite $U_0 \subset U$ for which $\bigvee f(U_0) \vee f(D_{q-a}) = \top$, so that by
(12.59) we have $D_b \vee D_{q-a} = \top$, with $D_b = \bigvee U_0$. By (12.46) we have

$$D_{a-q} \wedge D_{q-a} = \bot,$$ (12.76)

and hence

$$D_{a-q} = D_{a-q} \wedge \top = D_{a-q} \wedge (D_b \vee D_{q-a}) = D_{a-q} \wedge D_b \leq D_b = \bigvee U_0.$$ □

If A is finite-dimensional, $L_A$ is a finite lattice. In that case, since $D_{a-q} = D_a$ for
small enough q, one simply has $x \lhd U$ iff $x \leq \bigvee U$ and the condition $x \lhd U \Rightarrow x \in U$
in (12.75) holds iff U is a (principal) down set, i.e. $U = \downarrow x$ for some $x \in L_A$ (not the
same x as the placeholder x in (12.75)). Hence for finite-dimensional A we obtain

$$\mathcal{O}(\Sigma(A)) \cong \mathrm{Idl}(L_A) = \{\downarrow x \mid x \in L_A\}.$$ (12.77)

12.3 Internal Gelfand spectrum and intuitionistic quantum logic

We are now going to combine the (a priori independent) material in the previous two
sections. The point of the above description of the topology $\mathcal{O}(\Sigma(A))$ of the Gelfand
spectrum $\Sigma(A)$ of a unital commutative C*-algebra A is that it may be “internalized”
to any topos (with natural number object, i.e., in which C*-algebras may be defined
internally in the first place). The key to the ensuing generalization of Gelfand duality
is that in topos theory (and more generally in constructive mathematics) the space
$\Sigma(A)$ in set theory needs to be replaced by the corresponding frame $\mathcal{O}(\Sigma(A))$, or
preferably by its associated locale, which confusingly is denoted by $\Sigma(A)$, even
though it is the same thing as $\mathcal{O}(\Sigma(A))$ and neither may be spatial (in being the
topology of some space); see §C.11 and §E.4 for this bizarre notation. Similarly, we
write $f : X \to Y$ for a map between locales, which is essentially the same as the
frame map $f^{-1} : \mathcal{O}(Y) \to \mathcal{O}(X)$ but seen as a map in the opposite direction (where
once again nothing is assumed about possible spatiality of the frames in question).

Using this notation, the constructive Gelfand isomorphism (which is valid in
any topos T in which commutative C*-algebras make sense) states:

Theorem 12.14. For each (internal) commutative unital C*-algebra A in T there
exists a compact regular locale $\Sigma(A)$ such that one has a Gelfand isomorphism

$$A \cong C(\Sigma(A), \mathbb{C}).$$ (12.78)

Furthermore, the locale $\Sigma(A)$ is uniquely determined by A up to isomorphism and
its corresponding frame is given by Theorem 12.8 (or, more explicitly, by (12.75) in
conjunction with Lemma 12.13, all of which makes sense internally).

Here $\cong$ denotes (internal) isomorphism of (commutative) C*-algebras, and the notation
$C(\Sigma(A), \mathbb{C})$ stands for the object of all frame maps from $\mathcal{O}(\mathbb{C})$ to $\mathcal{O}(\Sigma(A))$
(which object turns out to be a commutative C*-algebra in any case). As usual, we
denote the Gelfand transform $A \to C(\Sigma(A), \mathbb{C})$ by $a \mapsto \hat{a}$, where, as explained above,
the locale map $\hat{a} : \Sigma(A) \to \mathbb{C}$ is really the reverse reading of the frame map

$$\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\Sigma(A)).$$ (12.79)

Note that in Sets, the latter is given by its literal meaning, given $\hat{a} : \omega \mapsto \omega(a)$.

We will shortly apply this formalism to our internal C*-algebra $\underline{A}$ in the topos
T(A), but since these computations are a bit involved, as a warm-up we first apply
our machinery to a very simple case, namely $A = \mathbb{C}^n$ in Sets. Recall (12.44) etc.

For $A = \mathbb{C}^n$ we have $A^+ = (\mathbb{R}^n)^+$, in which $(r_1, \ldots, r_n) \approx (s_1, \ldots, s_n)$ just in case
$r_i = 0$ iff $s_i = 0$ for all $i = 1, \ldots, n$. Hence each equivalence class under $\approx$ has a unique
representative of the form $[k_1, \ldots, k_n]$ with $k_i = 0$ or $k_i = 1$; the pre-images of such
an element of $L_A$ in $A^+$ under the natural projection $A^+ \to A^+/\!\approx$ are the diagonal
matrices whose i'th entry is zero if $k_i = 0$ and any nonzero positive number if $k_i = 1$.
The partial order in $L_A$ is pointwise, i.e. $[k_1, \ldots, k_n] \leq [l_1, \ldots, l_n]$ iff $k_i \leq l_i$ for all i.
Hence $L_{\mathbb{C}^n}$ is isomorphic as a distributive lattice to the lattice $\mathcal{P}(D_n(\mathbb{C})) = \mathcal{P}(\mathbb{C}^n)$
of projections in $D_n(\mathbb{C})$, i.e. the lattice of diagonal projections in $M_n(\mathbb{C})$.
Under this isomorphism, $[k_1, \ldots, k_n]$ corresponds to the matrix $\mathrm{diag}(k_1, \ldots, k_n)$. If
we equip $\mathcal{P}(\mathbb{C}^n)$ with the usual partial ordering of projections on the Hilbert space
$\mathbb{C}^n$, viz. $e \leq f$ whenever $e\mathbb{C}^n \subseteq f\mathbb{C}^n$ (which coincides with their ordering as elements
of the positive cone of the C*-algebra $M_n(\mathbb{C})$), then this is even a lattice isomorphism.
Hence by (12.77), the frame $\mathcal{O}(\Sigma(\mathbb{C}^n))$ consists of all sets of the form $\downarrow e$,
$e \in \mathcal{P}(\mathbb{C}^n)$, partially ordered by inclusion. This means that

$$\mathcal{O}(\Sigma(\mathbb{C}^n)) \cong \mathcal{P}(\mathbb{C}^n),$$ (12.80)

under the further identification of $\downarrow p \subset \mathcal{P}(\mathbb{C}^n)$ with $p \in \mathcal{P}(\mathbb{C}^n)$. This starts out
just as an isomorphism of posets, and turns out to be one of frames (which in the
case at hand happen to be Boolean). To draw the connection with the usual spectrum
$\widehat{\mathbb{C}^n} = \{1, 2, \ldots, n\}$ of $\mathbb{C}^n$, we note that the right-hand side of (12.80) is isomorphic to
the discrete topology $\mathcal{O}(\widehat{\mathbb{C}^n})$ of $\widehat{\mathbb{C}^n}$ (i.e. its power set) under the frame isomorphism

$$\mathcal{P}(\mathbb{C}^n) \to \mathcal{O}(\widehat{\mathbb{C}^n});$$

$$\mathrm{diag}(k_1, \ldots, k_n) \mapsto \{i \in \{1, 2, \ldots, n\} \mid k_i = 1\}.$$ (12.81)

We now describe the Gelfand transform (12.78) - (12.79) for self-adjoint a, so
that one has a (locale) map $A_{sa} \to C(\Sigma(A), \mathbb{R})$. Let $a = (a_1, \ldots, a_n) \in (\mathbb{C}^n)_{sa} = \mathbb{R}^n$.
With $\Sigma(\mathbb{C}^n)$ realized as $\widehat{\mathbb{C}^n}$, this just reads $\hat{a}(i) = a_i$, for $\hat{a} : \widehat{\mathbb{C}^n} \to \mathbb{R}$. The induced
frame map $\hat{a}^{-1} : \mathcal{O}(\mathbb{R}) \to \mathcal{O}(\widehat{\mathbb{C}^n})$ is given by $U \mapsto \{i \in \{1, 2, \ldots, n\} \mid a_i \in U\}$, and
by (12.81), this is equivalent to

$$\hat{a}^{-1} : \mathcal{O}(\mathbb{R}) \to \mathcal{P}(\mathbb{C}^n);$$

$$U \mapsto 1_U(a),$$ (12.82)

where $U \in \mathcal{O}(\mathbb{R})$, and the right-hand side denotes the spectral projection $1_U(a)$
defined by the self-adjoint operator a on the Hilbert space $\mathbb{C}^n$.
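The warm-up case is small enough to compute by hand or by machine. The Python sketch below illustrates (12.81) and (12.82) under assumptions made here for illustration: an element of $\mathbb{R}^n$ is a tuple, a diagonal projection is a tuple of zeros and ones, and an open subset of the real line is encoded by its indicator function (the helper names are hypothetical).

```python
def spectral_projection(a, U):
    """1_U(a) for a = (a_1, ..., a_n): the diagonal projection with entry 1 where a_i lies in U."""
    return tuple(1 if U(x) else 0 for x in a)

def as_subset(diag):
    """The isomorphism (12.81): diag(k_1, ..., k_n) -> {i | k_i = 1}, with 1-based indices."""
    return {i + 1 for i, k in enumerate(diag) if k == 1}

def open_interval(r, s):
    """Indicator function of the open interval (r, s)."""
    return lambda x: r < x < s

def meet(e, f):
    return tuple(min(x, y) for x, y in zip(e, f))

def join(e, f):
    return tuple(max(x, y) for x, y in zip(e, f))

a = (0.5, 2.0, 0.5, -1.0)
U, V = open_interval(0, 1), open_interval(0.25, 3)

p = spectral_projection(a, U)
print(p, as_subset(p))

# The map U -> 1_U(a) preserves finite meets and joins, as a frame map should.
both = spectral_projection(a, lambda x: U(x) and V(x))
either = spectral_projection(a, lambda x: U(x) or V(x))
print(both == meet(p, spectral_projection(a, V)),
      either == join(p, spectral_projection(a, V)))
```

The subset returned by `as_subset` is exactly the inverse image $\{i \mid a_i \in U\}$, which is how (12.82) arises from (12.81).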

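After this warm-up, we now compute the Gelfand spectrum $\underline{\Sigma}(\underline{A})$ in our topos
T(A), for the special case $A = M_n(\mathbb{C})$ (which is still an exercise for the general case).
For simplicity we write $\underline{L}$ for the lattice $\underline{L}_{\underline{A}}$ in T(A); similarly, $\underline{\Sigma}$ stands for $\underline{\Sigma}(\underline{A})$.

First, for arbitrary A, the lattice functor $\underline{L}$ can be computed “locally”, in the sense
that $\underline{L}_0(C) = L_C$, see Proposition 12.17 in §12.4 below, so that by (12.44) one has

$$\underline{L}_0(C) = C^+/\!\approx.$$ (12.83)

Let $\mathcal{P}(C)$ be the (Boolean) lattice of projections in C, and consider the functor

$$\underline{\mathcal{P}}_0(C) = \mathcal{P}(C);$$ (12.84)

$$\underline{\mathcal{P}}_1(C \subseteq D) = (\mathcal{P}(C) \hookrightarrow \mathcal{P}(D)).$$ (12.85)

As in the case $A = \mathbb{C}^n$ just discussed, it follows that we may identify $\underline{L}_0(C)$ with
$\mathcal{P}(C)$ and hence we may and will identify the functor $\underline{L}$ with the functor $\underline{\mathcal{P}}$.

Second, whereas in Sets eq. (12.77) makes $\mathcal{O}(\Sigma)$ a subset of L, in the topos
T(A) the frame $\mathcal{O}(\underline{\Sigma})$ is a subobject of $\underline{\Omega}^{\underline{L}}$. It then follows from (12.11) that
$\mathcal{O}(\underline{\Sigma})(C)$ is a subset of $\mathrm{Sub}(\underline{\mathcal{P}}_{\uparrow C})$, the set of subfunctors of the functor $\underline{\mathcal{P}} : \mathcal{C}(A) \to \mathsf{Sets}$
restricted to $\uparrow C \subset \mathcal{C}(A)$. To see which subset, define

$$\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}) = \{S \in \mathrm{Sub}(\underline{\mathcal{P}}_{\uparrow C}) \mid \forall D \supseteq C\ \exists x_D \in \mathcal{P}(D) : S(D) = \downarrow x_D\}.$$ (12.86)

Thus $\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C})$ consists of subfunctors S of $\underline{\mathcal{P}}_{\uparrow C}$ that are locally down-sets. It
then follows from (12.77) and the local interpretation of the relation $\lhd$ in T(A) (see
Lemma 12.18 in §12.4 below) that the subobject $\mathcal{O}(\underline{\Sigma})$ of $\underline{\Omega}^{\underline{L}}$ in T(A) is the functor

$$\mathcal{O}(\underline{\Sigma})_0(C) = \mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C});$$ (12.87)

$$\mathcal{O}(\underline{\Sigma})_1(C \subseteq D) = (\mathcal{O}(\underline{\Sigma})(C) \to \mathcal{O}(\underline{\Sigma})(D)),$$ (12.88)

where the map $\mathcal{O}(\underline{\Sigma})(C) \to \mathcal{O}(\underline{\Sigma})(D)$ is inherited from $\underline{\Omega}^{\underline{\mathcal{P}}}$ (of which $\mathcal{O}(\underline{\Sigma})$ is a subobject), and hence is just
given by restricting an element of $\mathcal{O}(\underline{\Sigma})(C)$ to $\uparrow D$. Writing

$$\mathrm{Sub}_d(\underline{\mathcal{P}}) = \{S \in \mathrm{Sub}(\underline{\mathcal{P}}) \mid \forall D \in \mathcal{C}(A)\ \exists x_D \in \mathcal{P}(D) : S(D) = \downarrow x_D\},$$ (12.89)

it is convenient to embed $\mathrm{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}) \subset \mathrm{Sub}_d(\underline{\mathcal{P}})$ by requiring elements of the
left-hand side to vanish whenever D does not contain C. We also note that if S
is to be a subfunctor of $\underline{\mathcal{P}}_{\uparrow C}$, one must have $S(D) \subseteq S(E)$ whenever $D \subseteq E$, and
that $\downarrow x_D \subseteq \downarrow x_E$ iff $x_D \leq x_E$ in $\mathcal{P}(E)$. Thus one may simply describe elements of
$\mathcal{O}(\underline{\Sigma})(C)$ via maps $S : \mathcal{C}(A) \to \mathcal{P}(A)$ such that:

$$S(D) \in \mathcal{P}(D);$$ (12.90)

$$S(D) = 0 \ \text{ if } D \notin \uparrow C \ (\text{i.e. } C \not\subseteq D);$$ (12.91)

$$S(D) \leq S(E) \ \text{ if } C \subseteq D \subseteq E.$$ (12.92)

The corresponding element $\tilde{S}$ of $\mathcal{O}(\underline{\Sigma})(C)$ is then given by

$$\tilde{S}(D) = \downarrow S(D),$$ (12.93)

seen as a subset of $\mathcal{P}(D)$. Hence it is convenient to introduce the notation

$$\mathcal{O}(\Sigma)_{\uparrow C} = \{S : \uparrow C \to \mathcal{P}(A) \mid S(D) \in \mathcal{P}(D),\ S(D) \leq S(E) \text{ if } D \subseteq E\},$$ (12.94)

of which we single out the case $C = \mathbb{C} \cdot 1_A$, which will be of great importance:

$$\mathcal{O}(\Sigma) = \{S : \mathcal{C}(A) \to \mathcal{P}(A) \mid S(C) \in \mathcal{P}(C),\ S(C) \leq S(D) \text{ if } C \subseteq D\}.$$ (12.95)

Both are posets and even frames in the pointwise partial order with respect to the
usual ordering of projections (which algebraically means $e \leq f$ iff $ef = e$), i.e.,

$$S \leq T \Leftrightarrow S(C) \leq T(C) \text{ for all } C \in \mathcal{C}(A).$$ (12.96)

In terms of (12.94) - (12.95), we then have isomorphisms

$$\mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1) \cong \mathcal{O}(\Sigma);$$ (12.97)

$$\mathcal{O}(\underline{\Sigma})(C) \cong \mathcal{O}(\Sigma)_{\uparrow C}.$$ (12.98)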

More importantly, the frame $\mathcal{O}(\Sigma)$ in Sets is the key to the external description
of the internal frame $\mathcal{O}(\underline{\Sigma})$ in T(A); see the end of §E.4. Since $\mathcal{C}(A)$ carries the
Alexandrov topology, by (E.84) this description is given by the frame map

$$\pi_\Sigma^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\Sigma),$$ (12.99)

given on the basic opens $\uparrow D \in \mathcal{O}(\mathcal{C}(A))$ by

$$\pi_\Sigma^{-1}(\uparrow D) = \chi_{\uparrow D} : \quad E \mapsto 1 \ (E \supseteq D); \quad E \mapsto 0 \ (E \not\supseteq D).$$ (12.100)

As explained before, even in Sets, in principle $\mathcal{O}(\Sigma)$ is just a notation for a frame,
without suggesting that there exists an underlying space $\Sigma$ whose topology it is.
In this case, however, there is such a space (as we shall show in the next section),
and also (12.99) is in fact the inverse image map to a genuine map $\pi_\Sigma : \Sigma \to \mathcal{C}(A)$
between spaces (as opposed to the formal notation used for a locale map).

We now state the Heyting algebra structure of $\mathcal{O}(\Sigma)$. First, top and bottom are

$$\top(C) = 1 \ \text{ for all } C;$$ (12.101)

$$\bot(C) = 0 \ \text{ for all } C.$$ (12.102)

The logical operations on $\mathcal{O}(\Sigma)$ may be computed from the partial order as

$$(S \wedge T)(C) = S(C) \wedge T(C);$$ (12.103)

$$(S \vee T)(C) = S(C) \vee T(C);$$ (12.104)

$$(S \to T)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp \vee T(D);$$ (12.105)

$$(\neg S)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp;$$ (12.106)

$$(\neg\neg S)(C) = \bigwedge^{\mathcal{P}(C)}_{D \supseteq C}\ \bigvee^{\mathcal{P}(D)}_{E \supseteq D} S(E),$$ (12.107)

where the right-hand side of (12.105) (and similarly (12.106) - (12.107)) is short for

$$\bigwedge^{\mathcal{P}(C)}_{D \supseteq C} S(D)^\perp \vee T(D) = \bigvee \{e \in \mathcal{P}(C) \mid e \leq S(D)^\perp \vee T(D)\ \ \forall D \supseteq C\}.$$ (12.108)

Recall that a Heyting algebra is Boolean iff $\neg\neg S = S$ for each S. One sees from
(12.107) that (at least if n > 1) the property $\neg\neg S = S$ only holds iff S is either $\top$ or
$\bot$, so that the Heyting algebra $\mathcal{O}(\Sigma) \equiv \mathcal{O}(\Sigma(A))$ is properly intuitionistic.
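The formulas (12.101) - (12.108) can be tried out on a very small example. The Python sketch below is not the case $A = M_n(\mathbb{C})$ treated above: as a loudly stated assumption it takes the commutative toy algebra $A = \mathbb{C}^2$, whose only contexts are $\mathbb{C} \cdot 1$ and A itself, and it assumes that the finite-dimensional description (12.95) and the formulas above apply verbatim to this case. Projections are encoded as tuples of zeros and ones, and all names are hypothetical.

```python
from itertools import product

# Toy model (assumption): contexts of A = C^2 are C1 = C·1 and C2 = A, with C1 ⊆ C2.
ABOVE = {"C1": ["C1", "C2"], "C2": ["C2"]}           # all D ⊇ C
PROJ = {"C1": [(0, 0), (1, 1)],                       # projections in C·1
        "C2": [(0, 0), (1, 0), (0, 1), (1, 1)]}       # diagonal projections in C^2

def leq(e, f):
    return all(x <= y for x, y in zip(e, f))

def meet(e, f):
    return tuple(min(x, y) for x, y in zip(e, f))

def join(e, f):
    return tuple(max(x, y) for x, y in zip(e, f))

def comp(e):
    return tuple(1 - x for x in e)

# The frame of (12.95): S(C) is a projection in C, and S is monotone in C.
FRAME = [{"C1": e, "C2": f} for e, f in product(PROJ["C1"], PROJ["C2"]) if leq(e, f)]
TOP = {"C1": (1, 1), "C2": (1, 1)}
BOTTOM = {"C1": (0, 0), "C2": (0, 0)}

def largest_below(C, bounds):
    """Join of all projections in C lying below every bound, as in (12.108)."""
    out = (0, 0)
    for e in PROJ[C]:
        if all(leq(e, b) for b in bounds):
            out = join(out, e)
    return out

def implies(S, T):
    """Heyting implication (12.105)."""
    return {C: largest_below(C, [join(comp(S[D]), T[D]) for D in ABOVE[C]])
            for C in ABOVE}

def neg(S):
    """Heyting negation (12.106), i.e. S -> bottom."""
    return implies(S, BOTTOM)

def frame_leq(S, T):
    return all(leq(S[C], T[C]) for C in ABOVE)

def frame_meet(S, T):
    return {C: meet(S[C], T[C]) for C in ABOVE}

S = {"C1": (0, 0), "C2": (1, 1)}
print(len(FRAME), neg(neg(S)) == TOP, neg(neg(S)) == S)
```

In this toy case the element S that vanishes on $\mathbb{C} \cdot 1$ and equals 1 on A has double negation $\top$, so the frame is not Boolean. The toy algebra is too small to reproduce the stronger statement made above for $M_n(\mathbb{C})$, since here some elements other than $\top$ and $\bot$ do satisfy $\neg\neg S = S$.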

Since from both a physical and a logical point of view the Heyting algebra
$\mathcal{O}(\Sigma(A))$ has vast advantages over the projection lattice $\mathcal{P}(A)$ of Birkhoff and von
Neumann, we propose it as a candidate for a new quantum logic. Let us explain why.

Physically, in von Neumann’s approach each projection $e \in \mathcal{P}(A)$ defines an
elementary proposition, whereas in Bohr’s (where the classical context C is crucial)
an elementary proposition is a pair (C,e), where $e \in \mathcal{P}(C)$ is a proposition à la von
Neumann (who lost sight of the context C). If for each such pair (C,e) we define

$$S_{(C,e)} : \mathcal{C}(A) \to \mathcal{P}(A);$$ (12.109)

$$D \mapsto e \quad (C \subseteq D);$$ (12.110)

$$D \mapsto \bot \quad \text{otherwise},$$ (12.111)

we see that each pair (C,e) injectively defines an element of $\mathcal{O}(\Sigma)$. Furthermore,
each element S of $\mathcal{O}(\Sigma)$ is a disjunction over such elementary propositions, since

$$S = \bigvee_{C \in \mathcal{C}(A)} S_{(C,S(C))}.$$ (12.112)

In contrast to traditional quantum logic, both logical connectives $\wedge$ and $\vee$ on $\mathcal{O}(\Sigma)$
are physically meaningful, as they only involve local conjunctions $S(C) \wedge T(C)$ and
disjunctions $S(C) \vee T(C)$, for which $S(C) \in \mathcal{P}(C)$ and $T(C) \in \mathcal{P}(C)$ commute.

Logically, the absence of an implication arrow in quantum logic has always been
worrying; this has now been put straight in $\mathcal{O}(\Sigma)$, where $\to$ belongs to the defining
structure and behaves well logically. Truth attribution in quantum logic is equally
suspicious: for any state $\omega$ on A one declares a proposition $e \in \mathcal{P}(A)$ true iff $\omega(e) = 1$,
and false iff $\omega(e) = 0$, with no verdict otherwise (except probabilistically).

We, however, define a natural Kripke semantics (cf. §D.3) on $P = \mathcal{C}(A)$ by

$$V_\omega : \mathcal{O}(\Sigma) \to \mathrm{Upper}(\mathcal{C}(A)) = \mathcal{O}(\mathcal{C}(A));$$ (12.113)

$$V_\omega(S) = \{C \in \mathcal{C}(A) \mid \omega(S(C)) = 1\},$$ (12.114)

where $\mathcal{C}(A)$ carries the Alexandrov topology as usual. Note that $V_\omega(S)$ indeed defines
an upper set in $\mathcal{C}(A)$, for if $C \subseteq D$ then $S(C) \leq S(D)$, so that $\omega(S(C)) \leq \omega(S(D))$
by positivity of states, and hence $\omega(S(D)) = 1$ whenever $\omega(S(C)) = 1$
(given that $\omega(S(D)) \leq 1$, which is true since $0 \leq \omega(e) \leq 1$ for any projection e).
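The valuation (12.114) and its upper-set property can be checked on a small example. As an assumption made purely for illustration, the Python sketch below uses the commutative toy algebra $A = \mathbb{C}^2$ with its two contexts $\mathbb{C} \cdot 1 \subseteq A$, encodes projections as tuples of zeros and ones, and represents a state by a pair of probabilities; none of these names or encodings come from the text.

```python
from fractions import Fraction
from itertools import product

# Toy model (assumption): contexts of A = C^2 are C1 = C·1 and C2 = A, with C1 ⊆ C2.
ABOVE = {"C1": ["C1", "C2"], "C2": ["C2"]}
PROJ = {"C1": [(0, 0), (1, 1)],
        "C2": [(0, 0), (1, 0), (0, 1), (1, 1)]}

def leq(e, f):
    return all(x <= y for x, y in zip(e, f))

# Elements S with S(C) a projection in C and S(C1) <= S(C2), cf. (12.95).
FRAME = [{"C1": e, "C2": f} for e, f in product(PROJ["C1"], PROJ["C2"]) if leq(e, f)]

def state(weights):
    """State on C^2 given by probabilities (w1, w2): omega(e) = w1*e1 + w2*e2."""
    return lambda e: sum(w * x for w, x in zip(weights, e))

def valuation(omega, S):
    """V_omega(S) of (12.114): the contexts C with omega(S(C)) = 1."""
    return {C for C in ABOVE if omega(S[C]) == 1}

def is_upper(V):
    """Upper set: with each context it contains every larger context."""
    return all(D in V for C in V for D in ABOVE[C])

pure = state((Fraction(1), Fraction(0)))
mixed = state((Fraction(1, 2), Fraction(1, 2)))
S = {"C1": (0, 0), "C2": (1, 0)}

print(valuation(pure, S), valuation(mixed, S))
print(all(is_upper(valuation(w, T)) for w in (pure, mixed) for T in FRAME))
```

Exact rational weights are used so that the comparison with 1 in `valuation` is not affected by rounding.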

As explained in §D.3, a proposition $S \in \mathcal{O}(\Sigma)$ is true in a state $\omega$ if $V_\omega(S) = \mathcal{C}(A)$,
i.e. the top element of the frame $\mathcal{O}(\mathcal{C}(A))$; we also declare it false if $V_\omega(S) = \emptyset$,
i.e. the bottom element of $\mathcal{O}(\mathcal{C}(A))$. Then $\neg S$ is true iff S is false, and $S \vee T$ is true
iff either S or T is true (since $V_\omega(S) = \mathcal{C}(A)$ iff $S(\mathbb{C} \cdot 1) = 1$, which forces $S(C) = 1$
for all C). Consequently, (12.114) simply lists the contexts C in which S(C) is true.

12.4 Internal Gelfand spectrum for arbitrary C*-algebras

In this section we compute the internal Gelfand spectrum $\underline{\Sigma}(\underline{A}) \equiv \underline{\Sigma}$ in T(A) for an
arbitrary unital C*-algebra A. Recall Definition D.6 (in §D.1) of a free lattice $\mathcal{L}_S$
on a set S, and its refinement in quotienting by a congruence on $\mathcal{L}_S$ explained after
that definition. According to Definition E.21, lattices can be defined in any topos.
The following “locality lemma” shows that the construction of a free lattice on some
object makes sense in functor toposes, and so does its refinement just mentioned, at
least as long as the congruence in question is defined through equalities.

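Lemma 12.15. Let $\mathsf{T} = [\mathsf{C}, \mathsf{Sets}]$ be any functor topos (where $\mathsf{C}$ is some category).

1. There exists a free distributive lattice $\underline{\mathcal{L}}_{\underline{S}} \in \mathsf{T}$ on any object $\underline{S} \in \mathsf{T}$, which can be
computed locally: the object part of $\underline{\mathcal{L}}_{\underline{S}}$ is given by

$$(\underline{\mathcal{L}}_{\underline{S}})_0(C) = \mathcal{L}_{\underline{S}_0(C)},$$ (12.115)

where $\mathcal{L}_{\underline{S}_0(C)}$ is defined in Sets, and the arrow part is defined as follows. If $f : C \to D$,
then $(\underline{\mathcal{L}}_{\underline{S}})_1(f)$ is the unique arrow making the following diagram commute:

$$\begin{array}{ccc}
\underline{S}_0(C) & \xrightarrow{\ \underline{S}_1(f)\ } & \underline{S}_0(D) \\
\downarrow & & \downarrow \\
\mathcal{L}_{\underline{S}_0(C)} & \xrightarrow{\ (\underline{\mathcal{L}}_{\underline{S}})_1(f)\ } & \mathcal{L}_{\underline{S}_0(D)}
\end{array}$$ (12.116)

2. The same is true if $\underline{\mathcal{L}}_{\underline{S}}$ is subject to relations defined by equalities among
elements of $\underline{\mathcal{L}}_{\underline{S}}$ (as long as these equalities generate a congruence).

Proof. The proof is an elaborate verification, which may be summarized as follows.

1. Existence and uniqueness of the arrow $(\underline{\mathcal{L}}_{\underline{S}})_1(f)$ in (12.116) follows from the
universal property of the free distributive lattice $\mathcal{L}_{\underline{S}_0(C)}$ in Sets; just consider
the composite of $\underline{S}_1(f) : \underline{S}_0(C) \to \underline{S}_0(D)$ with the canonical map $\underline{S}_0(D) \to \mathcal{L}_{\underline{S}_0(D)}$.
The claim follows from the fact that
$\underline{\mathcal{L}}_{\underline{S}}$ (defined locally) has the required universal property (as can be established
locally, from the corresponding property of each $(\underline{\mathcal{L}}_{\underline{S}})_0(C)$) and hence is unique.

2. This is proved in a similar way, since also a free distributive lattice $\mathcal{L}_S/\!\sim$ on
generators S with relations given by equalities has a universal property, cf. the
final part of §D.1. This works locally in a functor topos by rule no. 7 of Kripke-
Joyal semantics, cf. §E.5 (which states that equalities are enforced locally). □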

We will apply this lemma to T = T(A), as in (12.1), with $\mathsf{C} = \mathcal{C}(A)$. This hinges on a
lemma of independent interest, which we first state for Sets, i.e., for “ordinary” commutative
unital C*-algebras A, to be subsequently internalized to our topos T(A).

Lemma 12.16. The lattice $L_A$ in (12.44) is (constructively) isomorphic to the lattice
$L'_A$ freely generated by the symbols $D_a$, $a \in A_{sa}$ and the relations (12.45) - (12.50).

Proof. The point is that the map $a \mapsto D_a$ from $A_{sa}$ to $L'_A$ is surjective; this follows
from the relations (12.45) - (12.50) through their consequences (12.51) - (12.55).
The pertinent isomorphism $L'_A \cong L_A$ is then given by mapping $D_a \mapsto [a^+]$ on generators
(note that in the original discussion of $L_A$ following (12.44) this map was the
definition of $D_a$; this time, these play an independent role as generators of the lattice
$L'_A$, and in the present proof they are related to the elements $[a^+] \in L_A$). □

Now let $A$ be a (not necessarily commutative) unital C*-algebra (in Sets), with 
ensuing internal commutative C*-algebra $\underline{A}$ in the functor topos $\mathsf{T}(A)$, cf. Theorem 
12.3. Our goal is to apply the constructive definition of the Gelfand spectrum $\Sigma(A)$, 
or rather of its topology $\mathcal{O}(\Sigma(A))$ (seen as a frame, so that $\Sigma(A)$ is seen as a locale) 
in §12.2 to $\underline{A}$. The first step concerns the lattice $L_A$, which in $\mathsf{T}(A)$ is denoted by $\underline{L}_{\underline{A}}$. 
Here and in what follows, we try to avoid notational confusion by writing $\underline{D}_a$ for the 
formal variable indexed by $a$ (which is a variable of type $\underline{A}$ in $\mathsf{T}(A)$), whilst writing 
$D_c$ for the actual element $[c^+]$ of $L_C$ if we apply (12.44) etc. to $C \in \mathcal{C}(A)$. 

Proposition 12.17. For each $C \in \mathcal{C}(A)$ one has 

$\underline{L}_{\underline{A}}(C) = L_C$, (12.117) 

where $L_C$ is defined in Sets through (12.44) (with $A \rightsquigarrow C$), where it may be computed 
through Lemma 12.16. Furthermore, if $C \subseteq D$, then the map $\underline{L}_{\underline{A}}(C) \to \underline{L}_{\underline{A}}(D)$ given 
by the functoriality of $\underline{L}_{\underline{A}}$, i.e., $L_C \to L_D$, maps each generator $D_c$ in $L_C$ (where 
$c \in C_{sa}$) to the same generator in $L_D$. This is well defined, because $c \in D_{sa}$, and this 
inclusion preserves the relations (12.45) - (12.50). We write this as $L_C \hookrightarrow L_D$. 

Proof. Internalizing Lemma 12.16 to our functor topos $\mathsf{T}(A)$, it follows that 
the internal lattice $\underline{L}_{\underline{A}}$ in $\mathsf{T}(A)$ is isomorphic to a distributive lattice freely generated 
by generators and relations given by equalities. Hence Lemma 12.15 applies to it. □ 

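The next step is to move from $L_A$ to the corresponding frame of regular ideals, 
cf. Theorem 12.8. Abbreviating $\mathcal{O}(\Sigma(A)) \equiv \mathcal{O}(\Sigma)$, we first rewrite (12.60) as 

$\mathcal{O}(\Sigma) \cong \{U \in \mathrm{Idl}(L_A) \mid \forall_{q>0}\, D_{a-q} \in U \Rightarrow D_a \in U\}$. (12.118) 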

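To apply this to our functor topos $\mathsf{T}(A)$, we apply Kripke-Joyal semantics for the 
internal language of the topos $\mathsf{T}(A)$ (which is reviewed in §E.5) to the formula $\underline{D}_a \lhd \underline{U}$. 
This is a formula $\varphi$ with two free variables, namely $\underline{D}_a$ of type $\underline{L}_{\underline{A}}$, and $\underline{U}$ of type 

$\mathcal{P}(\underline{L}_{\underline{A}}) \equiv \Omega^{\underline{L}_{\underline{A}}}$. (12.119) 

Hence in the forcing statement $C \Vdash \varphi(\alpha)$ in $\mathsf{T}(A)$, we have to insert 

$\alpha \in (\underline{L}_{\underline{A}} \times \Omega^{\underline{L}_{\underline{A}}})(C) \cong L_C \times \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C})$, 

where $\underline{L}_{\underline{A}}|_{\uparrow C}$ is the restriction of the functor 

$\underline{L}_{\underline{A}} : \mathcal{C}(A) \to \mathrm{Sets}$ (12.120) 

to $\uparrow C \subset \mathcal{C}(A)$. Here we have used (12.117), as well as the isomorphism (12.11). 
Consequently, we have 

$\alpha = (D_c, \underline{U})$, (12.121) 

where $D_c \in L_C$ for some $c \in C_{sa}$, and $\underline{U} : \uparrow C \to \mathrm{Sets}$ is a subfunctor of $\underline{L}_{\underline{A}}|_{\uparrow C}$. In 
particular, $\underline{U}(D) \subseteq L_D$ is defined whenever $D \supseteq C$, and the subfunctor condition on 
$\underline{U}$ simply boils down to $\underline{U}(D) \subseteq \underline{U}(E)$ whenever $C \subseteq D \subseteq E$. 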

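Lemma 12.18. In the topos $\mathsf{T}(A)$, the cover $\lhd$ of Lemma 12.13 may be computed 
locally, in the sense that for any $C \in \mathcal{C}(A)$, $D_c \in L_C$ and $\underline{U} \in \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C})$, one has 

$C \Vdash (\underline{D}_a \lhd \underline{U})(D_c, \underline{U})$ iff $D_c \lhd_C \underline{U}(C)$, 

in that for all $q > 0$ there exists a finite $U_0 \subseteq \underline{U}(C)$ such that $D_{c-q} \leq \bigvee U_0$. 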

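Proof. We assume that $\bigvee U_0 \in \underline{U}$, so that we may replace $U_0$ by $D_b = \bigvee U_0$; the 
general case is analogous. We then have to inductively analyze the formula $\underline{D}_a \lhd \underline{U}$, 
which, under the stated assumption, in view of Lemma 12.13 may be taken to mean 

$\forall_{q>0}\, \exists_{D_b \in L_A}\, (D_b \in U \wedge D_{a-q} \leq D_b)$. (12.122) 

We now infer from the rules for Kripke-Joyal semantics in a functor topos that 

$C \Vdash (D_a \in U)(D_c, \underline{U})$ (12.123) 

iff for all $D \supseteq C$ one has $D_c \in \underline{U}(D)$; since $\underline{U}(C) \subseteq \underline{U}(D)$, this happens to be the 
case iff $D_c \in \underline{U}(C)$. Furthermore, 

$C \Vdash (D_b \leq D_a)(D_{c'}, D_c)$ (12.124) 

iff $D_{c'} \leq D_c$ in $L_C$. Also, 

$C \Vdash (\exists_{D_b \in L_A}\, D_b \in U \wedge D_{a-q} \leq D_b)(D_c, \underline{U})$ (12.125) 

iff there is $D_{c'} \in \underline{U}(C)$ such that $D_{c-q} \leq D_{c'}$. Finally, 

$C \Vdash (\forall_{q>0}\, \exists_{D_b \in L_A}\, D_b \in U \wedge D_{a-q} \leq D_b)(D_c, \underline{U})$ (12.126) 

iff for all $D \supseteq C$ and all $q > 0$ there is $D_d \in \underline{U}(D)$ such that $D_{c-q} \leq D_d$, where 
$D_c \in L_C$ is seen as an element of $L_D$ through the injection $L_C \hookrightarrow L_D$ of Proposition 
12.17, and $\underline{U} \in \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C})$ is seen as an element of $\mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow D})$ by restriction. This, 
however, is true at all $D \supseteq C$ iff it is true at $C$, because $\underline{U}(C) \subseteq \underline{U}(D)$ and hence one 
can take $D_d = D_{c'}$ for the $D_{c'} \in L_C$ that makes the condition true at $C$. □ 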

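Lemma 12.19. The spectrum $\mathcal{O}(\underline{\Sigma})$ of $\underline{A}$ in $\mathsf{T}(A)$ may be computed as follows: 

1. At $C \in \mathcal{C}(A)$, the set $\mathcal{O}(\underline{\Sigma})(C)$ consists of those subfunctors $\underline{U} \in \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C})$ such 
that for all $D \supseteq C$ and all $D_d \in L_D$ one has: 

$D_d \lhd_D \underline{U}(D) \Rightarrow D_d \in \underline{U}(D)$. 

2. At $\mathbb{C} \cdot 1$, the set $\mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1)$ consists of those subfunctors $\underline{U} \in \mathrm{Sub}(\underline{L}_{\underline{A}})$ such that 
for all $C \in \mathcal{C}(A)$ and all $D_c \in L_C$ one has: 

$D_c \lhd_C \underline{U}(C) \Rightarrow D_c \in \underline{U}(C)$. 

3. The condition that $\underline{U} = \{\underline{U}(C) \subseteq L_C\}_{C \in \mathcal{C}(A)}$ be a subfunctor of $\underline{L}_{\underline{A}}$ comes down 
to the requirement that: 

$C \subseteq D \Rightarrow \underline{U}(C) \subseteq \underline{U}(D)$. 

4. The map $\mathcal{O}(\underline{\Sigma})(C) \to \mathcal{O}(\underline{\Sigma})(D)$ given by the functoriality of $\mathcal{O}(\underline{\Sigma})$ whenever 
$C \subseteq D$ is given by truncating an element $\underline{U} : \uparrow C \to \mathrm{Sets}$ of $\mathcal{O}(\underline{\Sigma})(C)$ to $\uparrow D$. 

5. The external description of $\mathcal{O}(\underline{\Sigma})$ is the frame map 

$\pi^*_\Sigma : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1)$, (12.127) 

given on the basic opens $\uparrow D \in \mathcal{O}(\mathcal{C}(A))$ by 

$\pi^*_\Sigma(\uparrow D) = \chi_{\uparrow D} : E \mapsto \top \;\; (E \supseteq D); \quad E \mapsto \bot \;\; (E \nsupseteq D)$, (12.128) 

where the top and bottom elements $\top$, $\bot$ at $E$ are given by $\{L_E\}$ and $\emptyset$, respectively. 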

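Proof. By (12.75), $\mathcal{O}(\underline{\Sigma})$ is the subobject of $\Omega^{\underline{L}_{\underline{A}}}$ defined by the formula $\varphi$ given by 

$\forall_{D_a \in L_A}\, D_a \lhd U \Rightarrow D_a \in U$, (12.129) 

whose interpretation in $\mathsf{T}(A)$ is an arrow from $\Omega^{\underline{L}_{\underline{A}}}$ to $\Omega$. In view of (12.11), we 
may identify an element $\underline{U} \in \mathcal{O}(\underline{\Sigma})(C)$ with a subfunctor of $\underline{L}_{\underline{A}}|_{\uparrow C}$, and by (12.129) 
and Kripke-Joyal semantics in functor topoi, we have $\underline{U} \in \mathcal{O}(\underline{\Sigma})(C)$ iff $C \Vdash \varphi(\underline{U})$, 
with $\varphi$ given by (12.129). Unfolding this using Kripke-Joyal semantics and using 
Lemma 12.18 (including part 1 of its proof), we find that $\underline{U} \in \mathcal{O}(\underline{\Sigma})(C)$ iff 

$\forall_{D \supseteq C}\, \forall_{D_d \in L_D}\, \forall_{E \supseteq D}\; D_d \lhd_E \underline{U}(E) \Rightarrow D_d \in \underline{U}(E)$, (12.130) 

where $D_d$ is regarded as an element of $L_E$. This condition, however, is equivalent to 
the apparently weaker condition 

$\forall_{D \supseteq C}\, \forall_{D_d \in L_D}\; D_d \lhd_D \underline{U}(D) \Rightarrow D_d \in \underline{U}(D)$; (12.131) 

indeed, condition (12.130) clearly implies (12.131), but the latter applied at $D = E$ 
actually implies the first, since $D_d \in L_D$ also lies in $L_E$. 

Clauses 2 to 4 should now be obvious. Clause 5 follows by the explicit prescription 
for the external description of frames (which has been recalled in the previous 
section, after its initial description at the end of §E.4). Note that each $\mathcal{O}(\underline{\Sigma})(C)$ is a 
frame in Sets, inheriting the frame structure of the ambient frame $\mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C})$. □ 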
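We now present the computation of $\mathcal{O}(\underline{\Sigma}) \equiv \mathcal{O}(\underline{\Sigma}(\underline{A}))$ for general unital C*-algebras $A$. 
To explain the final formula, topologize the disjoint union 

$\Sigma^A = \bigsqcup_{C \in \mathcal{C}(A)} \Sigma(C)$, (12.132) 

where $\Sigma(C)$ is the Gelfand spectrum of $C \in \mathcal{C}(A)$, as follows, abbreviating 

$\mathcal{U}_C = \mathcal{U} \cap \Sigma(C)$. (12.133) 

One has $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ iff the following two conditions are satisfied for all $C \in \mathcal{C}(A)$: 

1. $\mathcal{U}_C \in \mathcal{O}(\Sigma(C))$. 

2. For all $D \supseteq C$, if $\lambda \in \mathcal{U}_C$ and $\lambda' \in \Sigma(D)$ such that $\lambda'|_C = \lambda$, then $\lambda' \in \mathcal{U}_D$. 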
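
The two openness conditions can be tried out on a finite toy model. The sketch below is an illustration only and rests on assumptions that are not in the text: A is taken to be the diagonal algebra of 3 x 3 matrices, a context is encoded as a partition of {0, 1, 2}, the points of the spectrum of a context are its blocks, and restriction sends a block to the coarser block containing it. The names `part`, `contains`, `is_open` and `open_hull` are hypothetical helpers.

```python
# Toy model (an assumption, not taken from the text): A is the diagonal
# algebra C^3 and a context is a partition of {0, 1, 2}. The blocks of a
# partition play the role of the points of the spectrum of that context.
def part(*blocks):
    return frozenset(frozenset(b) for b in blocks)

CONTEXTS = [
    part({0, 1, 2}),        # the trivial context C.1
    part({0}, {1, 2}),
    part({1}, {0, 2}),
    part({2}, {0, 1}),
    part({0}, {1}, {2}),    # the maximal context
]

def contains(d, c):
    """True if context c is a subalgebra of context d, i.e. d refines c."""
    return all(any(bd <= bc for bc in c) for bd in d)

def points():
    """All points (C, lam) of the disjoint union of the spectra."""
    return [(c, lam) for c in CONTEXTS for lam in c]

def is_open(u):
    """Condition 2 of the text. Condition 1 is automatic here, because
    every finite spectrum carries the discrete topology."""
    for (c, lam) in u:
        for d in CONTEXTS:
            if contains(d, c):
                # lam2 restricts to lam exactly when the block lam2 lies in lam
                for lam2 in d:
                    if lam2 <= lam and (d, lam2) not in u:
                        return False
    return True

def open_hull(u):
    """Smallest open set containing the set of points u."""
    hull = set(u)
    for (c, lam) in u:
        for d in CONTEXTS:
            if contains(d, c):
                hull |= {(d, lam2) for lam2 in d if lam2 <= lam}
    return hull
```

In this model the smallest open set containing the single point of the trivial context is the whole space, whereas a single point of the maximal context is already open.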

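In fact, $\mathcal{O}(\Sigma^A)$ is simply the weakest topology making the canonical projection 

$\pi : \Sigma^A \to \mathcal{C}(A)$; (12.134) 

$\pi(\sigma) = C \quad (\sigma \in \Sigma(C) \subset \Sigma^A)$, (12.135) 

continuous with respect to the Alexandrov topology on $\mathcal{C}(A)$. For $U \in \mathcal{O}(\mathcal{C}(A))$, 

$\Sigma^A_U = \bigsqcup_{C \in U} \Sigma(C)$ (12.136) 

is a subset of $\Sigma^A$, with relative topology inherited from $\Sigma^A$. In particular, for the 
basic opens $U = \uparrow C$ of the Alexandrov topology on $\mathcal{C}(A)$ we have 

$\Sigma^A_{\uparrow C} = \bigsqcup_{D \supseteq C} \Sigma(D)$. (12.137) 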

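Theorem 12.20. Let $A$ be a unital C*-algebra. The internal Gelfand spectrum 
$\mathcal{O}(\underline{\Sigma}(\underline{A}))$ of our internal commutative C*-algebra $\underline{A}$ in the topos $\mathsf{T}(A)$ is the functor 

$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0 : C \mapsto \mathcal{O}(\Sigma^A_{\uparrow C})$, (12.138) 

i.e., the frame (in Sets) of the open sets of $\Sigma^A_{\uparrow C}$ in the topology defined after (12.132); 
if $C \subseteq D$, the arrow-part of the functor in question is given by 

$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 : \mathcal{O}(\Sigma^A_{\uparrow C}) \to \mathcal{O}(\Sigma^A_{\uparrow D})$; (12.139) 

$\mathcal{U} \mapsto \mathcal{U} \cap \Sigma^A_{\uparrow D}$. (12.140) 

Similarly, in the description of $\mathsf{T}(A)$ as the category of sheaves $\mathrm{Sh}(\mathcal{C}(A))$, cf. (E.84), 
the Gelfand spectrum is given by the sheaf (where $U \subseteq V$ in (12.142)): 

$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0 : U \mapsto \mathcal{O}(\Sigma^A_U) \quad (U \in \mathcal{O}(\mathcal{C}(A)))$; (12.141) 

$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 : \mathcal{U} \mapsto \mathcal{U} \cap \Sigma^A_U \quad (\mathcal{U} \in \mathcal{O}(\Sigma^A_V))$. (12.142) 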
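Proof. The proof is based on Lemma 12.19, which implies that the internal frame 
$\mathrm{RIdl}(\underline{L}_{\underline{A}})$ in $\mathsf{T}(A)$ is given by the functor 

$\mathrm{RIdl}(\underline{L}_{\underline{A}}) : C \mapsto \{\underline{F} \in \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C}) \mid \underline{F}(D) \in \mathrm{RIdl}(L_D) \text{ for all } D \supseteq C\}$. (12.143) 

Here, since $D$ is a commutative unital C*-algebra in Sets, according to (12.60) the 
set $\mathrm{RIdl}(L_D)$ may be identified with the topology $\mathcal{O}(\Sigma(D))$, where $\Sigma(D)$ is the 
Gelfand spectrum of $D$ in the usual sense. We will make this identification in the 
following step, which is the last step of the proof of Theorem 12.20. 

Lemma 12.21. The transformation $\theta : \mathrm{RIdl}(\underline{L}_{\underline{A}}) \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$ with components 

$\theta_C : \{\underline{F} \in \mathrm{Sub}(\underline{L}_{\underline{A}}|_{\uparrow C}) \mid \underline{F}(D) \in \mathcal{O}(\Sigma(D)) \text{ for all } D \supseteq C\} \to \mathcal{O}(\Sigma^A_{\uparrow C})$; 

$\underline{F} \mapsto \bigsqcup_{D \supseteq C} \underline{F}(D)$ (12.144) 

is a natural isomorphism of functors, i.e., an isomorphism of objects in $\mathsf{T}(A)$. 

Since $\mathrm{RIdl}(\underline{L}_{\underline{A}})$ and $\mathcal{O}(\underline{\Sigma})$ are internal frames in $\mathsf{T}(A)$, it suffices to prove that each 
$\theta_C$ is an isomorphism of frames in Sets. Unfortunately, even this proof is a very 
lengthy (though straightforward) affair, for which we refer to the literature. □ 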

Corollary 12.22. The external description (in Sets) of the internal locale $\underline{\Sigma}(\underline{A})$ (in 
$\mathsf{T}(A)$) is given by the canonical projection (12.134). 

Note that both $\Sigma^A$ and $\mathcal{C}(A)$ are topological spaces, so that (12.134) is a bona fide 
continuous map between spaces. This is worth stressing, since in general, an external 
description of an internal locale in a sheaf topos, though defined in Sets, is a map 
between locales (or, equivalently, between frames) that are not necessarily topological 
spaces. But in the case (12.134) at hand they are, so at least this time there is 
no confusion between $\mathcal{O}(X)$ as both formal notation for a frame (not necessarily 
coming from a topology) and notation for the topology of a space $X$, see §C.11. 
Note that (12.95) is a special case of Theorem 12.20 or Corollary 12.22, for 

$A = M_n(\mathbb{C})$. (12.145) 

To see this, we identify $\mathcal{U} = \bigsqcup_{C \in \mathcal{C}(A)} \mathcal{U}_C$ as an element of $\mathcal{O}(\Sigma^A)$ with 

$S : \mathcal{C}(A) \to \mathcal{P}(A)$ 

on the right-hand side of (12.95), where $S(C) \in \mathcal{P}(C)$ is the image of $\mathcal{U}_C \in \mathcal{O}(\Sigma(C))$ 
under the isomorphism $\mathcal{O}(\Sigma(C)) \to \mathcal{P}(C)$ between the (discrete) topology 
of the (finite) Gelfand spectrum of $C$ and the (Boolean) projection lattice of $C$ 
derived earlier, see (12.80). Similarly, for $U \in \mathcal{O}(\mathcal{C}(A))$, the frame $\mathcal{O}(\Sigma^A_U)$ may be 
identified with maps 

$S : U \to \mathcal{P}(A)$ 

satisfying the conditions in (12.95). Of course, the special case (12.145) leading to 
(12.95) is very appealing, and was well worth treating in its own right! 
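Theorem 12.20 and Corollary 12.22 also give an explicit description of the general 
internal Gelfand isomorphism (12.78), whose real part in $\mathsf{T}(A)$ reads 

$\underline{A}_{sa} \cong C(\underline{\Sigma}, \underline{\mathbb{R}}) \equiv \mathrm{Frm}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma}))$, (12.146) 

where the right-hand side, which denotes the object of frame homomorphisms from 
$\mathcal{O}(\underline{\mathbb{R}})$ to $\mathcal{O}(\underline{\Sigma})$ within $\mathsf{T}(A)$, is the definition of the middle term (which is just a 
notation). To understand the situation in $\mathsf{T}(A)$, one has to distinguish between: 

1. The object $\mathrm{Frm}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma}))$ in $\mathsf{T}(A)$, defined as the subobject of the exponential 
$\mathcal{O}(\underline{\Sigma})^{\mathcal{O}(\underline{\mathbb{R}})}$ consisting of (internal) frame maps from $\mathcal{O}(\underline{\mathbb{R}})$ to $\mathcal{O}(\underline{\Sigma})$. 

2. The set $\mathrm{Hom}_{\mathrm{Frm}}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma}))$ of internal frame maps from the frame $\mathcal{O}(\underline{\mathbb{R}})$ of 
(Dedekind) real numbers in $\mathsf{T}(A)$ to the frame $\mathcal{O}(\underline{\Sigma})$ (i.e., the set of those arrows 
from $\mathcal{O}(\underline{\mathbb{R}})$ to $\mathcal{O}(\underline{\Sigma})$ that happen to be frame maps as seen from within $\mathsf{T}(A)$). 

The connection between 1. and 2. is given by $\lambda$-conversion, i.e., the bijective 
correspondence between $C \to B^A$ and $A \times C \to B$, cf. (E.153). Taking $C = 1$ (i.e. the 
terminal object in $\mathsf{T}(A)$), we see that an element of the set $\mathrm{Hom}(A, B)$ corresponds 
to an arrow $1 \to B^A$. Eq. (12.8) yields 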

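$\mathrm{Frm}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma}))(C) = \mathrm{Nat}_{\mathrm{Frm}}(\mathcal{O}(\underline{\mathbb{R}})|_{\uparrow C}, \mathcal{O}(\underline{\Sigma})|_{\uparrow C})$, (12.147) 

the set of all natural transformations between the functors $\mathcal{O}(\underline{\mathbb{R}})$ and $\mathcal{O}(\underline{\Sigma})$, both 
restricted to $\uparrow C \subset \mathcal{C}(A)$, that are frame maps. This set can be computed from the 
external description of frames and frame maps in §E.4. Recall (12.4) etc. The frame 
$\mathcal{O}(\underline{\mathbb{R}})|_{\uparrow C}$ has external description 

$\pi^*_{\mathbb{R}} : \mathcal{O}(\uparrow C) \to \mathcal{O}(\uparrow C \times \mathbb{R})$, (12.148) 

where $\pi_{\mathbb{R}} : \uparrow C \times \mathbb{R} \to \uparrow C$ is projection on the first component. The special case 
$C = \mathbb{C} \cdot 1$ yields the external description of $\mathcal{O}(\underline{\mathbb{R}})$ itself, namely 

$\pi^*_{\mathbb{R}} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\mathcal{C}(A) \times \mathbb{R})$, (12.149) 

where this time (with abuse of notation) the projection is $\pi_{\mathbb{R}} : \mathcal{C}(A) \times \mathbb{R} \to \mathcal{C}(A)$. 
By Corollary 12.22, the Gelfand frame $\mathcal{O}(\underline{\Sigma})|_{\uparrow C}$ has external description 

$\pi^*_\Sigma : \mathcal{O}(\uparrow C) \to \mathcal{O}(\Sigma)|_{\uparrow C}$, (12.150) 

given by (12.128), with the understanding that $D \supseteq C$ (the special case $C = \mathbb{C} \cdot 1$ 
then recovers the external description (12.99) of $\mathcal{O}(\underline{\Sigma})$ itself). It follows that there 
is a bijective correspondence between two classes of frame maps: 

$\underline{\varphi}_C : \mathcal{O}(\underline{\mathbb{R}})|_{\uparrow C} \to \mathcal{O}(\underline{\Sigma})|_{\uparrow C}$ (in $\mathsf{T}(A)$); (12.151) 

$\varphi^*_C : \mathcal{O}(\uparrow C \times \mathbb{R}) \to \mathcal{O}(\Sigma)|_{\uparrow C}$ (in Sets), (12.152) 

where $\varphi^*_C$ must satisfy the condition that for any $D \supseteq C$, 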
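$\varphi^*_C(\uparrow D \times \mathbb{R}) = \chi_{\uparrow D}$. (12.153) 

Indeed, such a map $\varphi^*_C$ defines an element $\underline{\varphi}_C$ of $\mathrm{Nat}_{\mathrm{Frm}}(\mathcal{O}(\underline{\mathbb{R}})|_{\uparrow C}, \mathcal{O}(\underline{\Sigma})|_{\uparrow C})$ in the 
obvious way: for $D \in \uparrow C$, the components 

$\underline{\varphi}_C(D) : \mathcal{O}(\underline{\mathbb{R}})(D) \to \mathcal{O}(\underline{\Sigma})(D)$ (12.154) 

of the natural transformation $\underline{\varphi}_C$, i.e. 

$\underline{\varphi}_C(D) : \mathcal{O}(\uparrow D \times \mathbb{R}) \to \mathcal{O}(\Sigma)|_{\uparrow D}$, (12.155) 

are simply given by the restriction of $\varphi^*_C$ to $\mathcal{O}(\uparrow D \times \mathbb{R}) \subset \mathcal{O}(\uparrow C \times \mathbb{R})$; cf. (E.147). 
This is consistent, because (12.153) implies that for any $U \in \mathcal{O}(\mathbb{R})$ and $C \subseteq D \subseteq E$, 

$\varphi^*_C(\uparrow E \times U)(F) \leq \varphi^*_C(\uparrow D \times \mathbb{R})(F)$, (12.156) 

which by (12.153) vanishes whenever $F \nsupseteq D$. Consequently, 

$\varphi^*_C(\uparrow E \times U)(F) = 0$ if $F \nsupseteq D$, (12.157) 

so that $\underline{\varphi}_C(D)$ actually takes values in $\mathcal{O}(\Sigma)|_{\uparrow D}$ (rather than in $\mathcal{O}(\Sigma)|_{\uparrow C}$, as might 
be expected). Denoting the set of frame maps (12.152) that satisfy (12.153) by 
$\mathrm{Frm}'(\mathcal{O}(\uparrow C \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow C})$, we obtain a functor 

$\mathrm{Frm}'(\mathcal{O}(\uparrow(-) \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow(-)}) : \mathcal{C}(A) \to \mathrm{Sets}$, (12.158) 

with the stipulation that for $C \subseteq D$ the induced map 

$\mathrm{Frm}'(\mathcal{O}(\uparrow C \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow C}) \to \mathrm{Frm}'(\mathcal{O}(\uparrow D \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow D})$ 

is given by restricting an element of the left-hand side to $\mathcal{O}(\uparrow D \times \mathbb{R}) \subset \mathcal{O}(\uparrow C \times \mathbb{R})$; 
this is consistent by the same argument (12.157). 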

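The Gelfand isomorphism (12.78) is therefore a natural transformation 

$\underline{A}_{sa} \to \mathrm{Frm}'(\mathcal{O}(\uparrow(-) \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow(-)})$, (12.159) 

which means that one has a compatible (i.e. natural) family of isomorphisms 

$C_{sa} \to \mathrm{Frm}'(\mathcal{O}(\uparrow C \times \mathbb{R}), \mathcal{O}(\Sigma)|_{\uparrow C})$; 
$a \mapsto \hat{a}^* : \mathcal{O}(\uparrow C \times \mathbb{R}) \to \mathcal{O}(\Sigma)|_{\uparrow C}$. (12.160) 

On basic opens $\uparrow D \times U \in \mathcal{O}(\uparrow C \times \mathbb{R})$, with $D \supseteq C$, we obtain 

$\hat{a}^*(\uparrow D \times U) : E \mapsto 1_U(a)$ if $E \supseteq D$; 
$E \mapsto 0$ if $E \nsupseteq D$. (12.161) 

Here $1_U(a)$ is the spectral projection of $a$ in $U$, cf. (12.82); as it lies in $\mathcal{P}(C)$ and 
$C \subseteq D \subseteq E$, the projection $1_U(a)$ certainly lies in $\mathcal{P}(E)$, as required. Furthermore, 
we need to extend $\hat{a}^*$ to general opens in $\uparrow C \times \mathbb{R}$ by the frame map property, and 
note that (12.153) for $\varphi^*_C = \hat{a}^*$ is satisfied. 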

This analysis also holds in the topos $\mathrm{Sh}(\mathcal{C}(A))$ of sheaves on $\mathcal{C}(A)$ (as always, 
equipped with the Alexandrov topology, cf. (E.84)). It then follows from (12.159) 
and (12.141) that as a sheaf, 

$C(\underline{\Sigma}, \underline{\mathbb{C}}) : U \mapsto C(\Sigma^A_U, \mathbb{C})$, (12.162) 

where $\Sigma^A_U$ is given by (12.136); if $U \subseteq V$, the map $C(\Sigma^A_V, \mathbb{C}) \to C(\Sigma^A_U, \mathbb{C})$ is given by 
the pullback of the inclusion $\Sigma^A_U \hookrightarrow \Sigma^A_V$ (that is, by restriction). It then follows from 
(12.162) that the isomorphism (12.146) is given by its components 

$\underline{A}(U) \cong C(\Sigma^A_U, \mathbb{C})$. (12.163) 

In particular, the component of the natural isomorphism in (12.146) at $U = \uparrow C$ is 

$C \cong C(\Sigma^A_{\uparrow C}, \mathbb{C})$. (12.164) 

A glance at the topology of $\Sigma^A$ shows that the so-called Hausdorffication $X^H$ of 
$X = \Sigma^A_{\uparrow C}$, which 
for a general compact space $X$ may be defined either directly, or C*-algebraically by 
$X^H = \Sigma(C(X))$, and coincides with the left adjoint of the forgetful functor from 
the category of compact Hausdorff spaces (and continuous maps) to the category of 
compact spaces (and continuous maps), is given by $(\Sigma^A_{\uparrow C})^H = \Sigma(C)$, so that 

$C(\Sigma^A_{\uparrow C}, \mathbb{C}) \cong C(\Sigma(C), \mathbb{C})$, (12.165) 

where the isomorphism is given by restricting $f \in C(\Sigma^A_{\uparrow C}, \mathbb{C})$ to $\Sigma(C) \subset \Sigma^A_{\uparrow C}$. 

Corollary 12.23. The internal Gelfand isomorphism 

$\underline{A} \cong C(\underline{\Sigma}, \underline{\mathbb{C}})$, (12.166) 

which is a natural isomorphism between functors $\mathcal{C}(A) \to \mathrm{Sets}$, is given at each 
$C \in \mathcal{C}(A)$ by the usual Gelfand isomorphism for the commutative C*-algebra $C$: 

$\underline{A}_0(C) = C \cong C(\Sigma(C), \mathbb{C}) \cong C(\underline{\Sigma}, \underline{\mathbb{C}})_0(C)$. (12.167) 

At the end of the day, the Gelfand isomorphism (12.146) therefore turns out to 
simply assemble all isomorphisms (12.167) for the commutative C*-subalgebras 
$C$ of $A$ into a single sheaf-theoretic construction. Incidentally, taking $C = \mathbb{C} \cdot 1$ in 
(12.164) shows that $(\Sigma^A)^H$ is a point, which is also obvious from the fact that any 
open set containing the point $\Sigma(\mathbb{C} \cdot 1)$ of $\Sigma^A$ must be all of $\Sigma^A$. 

12.5 “Daseinisation” and Kochen-Specker Theorem 

The internal Gelfand transform (12.166) constructed in the previous section acts on 
each commutative subalgebra $C \in \mathcal{C}(A)$. What about $A$ itself? There is a more subtle 
transform, inspired by the remarkable “Daseinisation” construction of Döring and 
Isham (whose name has unfortunately been inspired by the controversial German 
philosopher Heidegger), which turns self-adjoint elements $a$ of $A$ into continuous 
functions $\delta(a)$ on the topos-theoretical phase space $\Sigma^A$ whose range is the so-called 
interval domain $\mathbb{I}\mathbb{R}$ (which is a fuzzy version of $\mathbb{R}$). Hence we will define a map 

$\delta : A_{sa} \to C(\Sigma^A, \mathbb{I}\mathbb{R})$, (12.168) 

which, alas, is defined only if $A$ is a von Neumann algebra; we shall therefore assume 
this throughout this section. Similarly, the notation $\mathcal{C}(A)$ will now stand for 
the poset of abelian von Neumann subalgebras of $A$ (as opposed to abelian C*-subalgebras 
of $A$, as in the remainder of this book). 

“Daseinisation” requires two slightly unusual concepts, the first of which is the 
said interval domain $\mathbb{I}\mathbb{R}$. To motivate its definition, consider Brouwer’s approximation 
of real numbers by nested intervals with endpoints in $\mathbb{Q}$. For example, the real 
number $\pi$ can be described by specifying the sequence 

[3,4], [3.1,3.2], [3.14,3.15], [3.141,3.142],... 

This description of the reals is formalized by $\mathbb{I}\mathbb{R}$, defined as the poset whose elements 
are compact intervals $[a,b]$ in $\mathbb{R}$ (including singletons $[a,a] = \{a\}$), ordered 
by reverse inclusion (for a smaller interval means that we have more information 
about the real number that the ever smaller intervals converge to). This poset is a 
so-called dcpo (for directed complete partial order); directed suprema are simply 
intersections. As such, it carries the Scott topology, whose open sets are upper subsets 
$U$ of $\mathbb{I}\mathbb{R}$ with the additional property that for every directed set $D$ with $\bigvee D \in U$ 
the intersection $D \cap U$ is nonempty. This means that each open interval $(p, q)$ in $\mathbb{R}$ 
(with $p = -\infty$ and $q = +\infty$ allowed) corresponds to a Scott open 

$U_{(p,q)} = \{[a,b] \mid p < a \leq b < q\}$. (12.169) 

Indeed, these opens form a basis of the Scott topology $\mathcal{O}_{\mathrm{Scott}}(\mathbb{I}\mathbb{R}) \equiv \mathcal{O}(\mathbb{I}\mathbb{R})$ of $\mathbb{I}\mathbb{R}$. 
This topology is, of course, a frame, so far defined in Sets. However, this frame is 
easily internalized to any (pre)sheaf topos, similar to the Dedekind reals (12.3) - 
(E.149); in particular, in $\mathsf{T}(A)$ we have 

$\mathcal{O}(\underline{\mathbb{I}\mathbb{R}})_0 : C \mapsto \mathcal{O}((\uparrow C) \times \mathbb{I}\mathbb{R})$, (12.170) 

with external description as a locale (see §E.4) given by the canonical projection 

$\pi_1 : \mathcal{C}(A) \times \mathbb{I}\mathbb{R} \to \mathcal{C}(A)$. (12.171) 
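
The order-theoretic part of the interval domain is easy to experiment with. The following is a minimal sketch, assuming rational endpoints and finite families only; the names `below`, `directed_sup`, `in_scott_open` and `PI_APPROX` are hypothetical and serve purely as illustration.

```python
from fractions import Fraction as Fr

# An element [a, b] of the interval domain is modelled as a pair (a, b), a <= b.

def below(x, y):
    """x <= y in the interval domain: y is contained in x (reverse inclusion)."""
    return x[0] <= y[0] and y[1] <= x[1]

def directed_sup(family):
    """Supremum of a finite directed family: the intersection of its intervals."""
    lo = max(x[0] for x in family)
    hi = min(x[1] for x in family)
    if lo > hi:
        raise ValueError("empty intersection, so the family is not directed")
    return (lo, hi)

def in_scott_open(x, p, q):
    """Membership of [a, b] in the basic Scott open {[a, b] | p < a <= b < q}."""
    return p < x[0] and x[1] < q

# The first four nested intervals approximating pi, as in the text.
PI_APPROX = [
    (Fr(3), Fr(4)),
    (Fr(31, 10), Fr(32, 10)),
    (Fr(314, 100), Fr(315, 100)),
    (Fr(3141, 1000), Fr(3142, 1000)),
]
```

For this chain the supremum is the last interval, and the basic open for (3, 7/2) contains every interval from [3.1, 3.2] onwards but not [3, 4], which matches the requirement that a Scott open containing a directed supremum already contains some member of the family.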
The second ingredient of “Daseinisation” is the spectral order on $A_{sa}$. The partial 
order $\leq$ defined in §C.7 (in which $a \leq b$ iff $\omega(a) \leq \omega(b)$ for all states $\omega$ on $A$) 
has good linearity properties in that it makes $A^+$ a convex cone in the real vector 
space $A_{sa}$ (cf. Definition C.50), but it is terrible from a lattice point of view (unless 
$A$ is abelian): for example, for $A = B(H)$, suprema $a \vee b$ and infima $a \wedge b$ exist iff 
either $a \leq b$ or $b \leq a$ (and indeed $A_{sa}$ is a lattice with respect to $\leq$ iff $A$ is abelian). 
However, there is a different order on $A_{sa}$ that turns it into a conditionally (or boundedly) 
complete lattice, i.e., a poset $X$ with the property that if some subset $S \subseteq X$ has 
an upper bound (i.e., there is $x \in X$ such that $s \leq x$ for each $s \in S$), then it has a 
lowest upper bound (i.e., $\bigvee S$ exists), and similarly for (greatest) lower bounds. 

Definition 12.24. For $a, b \in A_{sa}$ we say that $a \leq_s b$ (i.e., $a$ is less than or equal to $b$ in 
the spectral order) iff $a^n \leq b^n$ for each $n \in \mathbb{N}$. 
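
That the spectral order is genuinely stronger than the usual one can be seen already for 2 x 2 matrices. The sketch below is an illustration under stated assumptions: it works with positive integer matrices, tests positivity of a real symmetric 2 x 2 matrix through its trace and determinant, and checks the power criterion only up to a finite exponent, so it can refute but never prove a spectral inequality. All function names are hypothetical.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_psd(m):
    """A real symmetric 2x2 matrix is positive semidefinite iff
    its trace and its determinant are both nonnegative."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace >= 0 and det >= 0

def leq(x, y):
    """Usual order: x <= y iff y - x is positive semidefinite."""
    return is_psd([[y[i][j] - x[i][j] for j in range(2)] for i in range(2)])

def leq_powers(x, y, n_max):
    """Necessary condition for the spectral order: x^n <= y^n for n = 1..n_max."""
    px, py = x, y
    for _ in range(n_max):
        if not leq(px, py):
            return False
        px, py = matmul(px, x), matmul(py, y)
    return True

A = [[1, 1], [1, 1]]    # positive, eigenvalues 0 and 2
B = [[2, 1], [1, 1]]    # positive, B - A = diag(1, 0)
```

Here `leq(A, B)` holds, but `leq_powers(A, B, 2)` fails because B² − A² has negative determinant, so A is below B in the usual order but not in the spectral order. For commuting matrices such as diag(1, 2) and diag(2, 3) no such gap appears.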

It can be shown that $a \leq_s b$ iff for each $\lambda \in \mathbb{R}$ (note the change of order), 

$1_{(-\infty,\lambda]}(b) \leq 1_{(-\infty,\lambda]}(a)$, 

where $1_{(-\infty,\lambda]}(a)$ is the spectral projection $1_{(-\infty,\lambda] \cap \sigma(a)}(a)$. This, in turn, implies that 

$a \leq_s b$ iff $\mu_\omega(a \leq \lambda) \geq \mu_\omega(b \leq \lambda)$, (12.172) 

for each (normal) state $\omega$ on $A$ and each $\lambda \in \mathbb{R}$, where 

$\mu_\omega(a \leq \lambda) = \omega(1_{(-\infty,\lambda] \cap \sigma(a)}(a))$ (12.173) 

is the Born probability for the outcome $a \leq \lambda$ in state $\omega$ (and similarly for $b$). Furthermore, 
if $a$ and $b$ commute, or if $a$ and $b$ are both projections, then $a \leq_s b$ iff $a \leq b$, 
i.e., $\leq_s$ coincides with the usual partial order $\leq$ iff $A$ is abelian, and $\leq_s$ restricts to 
$\leq$ on the projection lattice $\mathcal{P}(A)$ of $A$. For each $a \in A_{sa}$ and $C \in \mathcal{C}(A)$, we define 

$\delta^i_C(a) = \bigvee \{b \in C_{sa} \mid b \leq_s a\}$; (12.174) 

$\delta^o_C(a) = \bigwedge \{b \in C_{sa} \mid a \leq_s b\}$, (12.175) 

called the inner and outer Daseinisation of $a$ with respect to $C$, respectively; those 
objecting to Heidegger might prefer to simply call these the inner and outer localizations 
of $a$ with respect to $C$. For projections, these expressions simplify to 

$\delta^i_C(e) = \bigvee \{f \in \mathcal{P}(C) \mid f \leq e\}$; (12.176) 

$\delta^o_C(e) = \bigwedge \{f \in \mathcal{P}(C) \mid e \leq f\}$, (12.177) 

and in fact one has a very nice categorical description, in that $\delta^i_C : \mathcal{P}(A) \to \mathcal{P}(C)$ 
and $\delta^o_C : \mathcal{P}(A) \to \mathcal{P}(C)$ are the right and left adjoint, respectively, of the inclusion 
functor $\mathcal{P}(C) \hookrightarrow \mathcal{P}(A)$ in the category of complete orthomodular lattices. 
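
For projections, the inner and outer localizations can be computed by hand in a finite-dimensional toy case. The sketch below assumes, purely for illustration, that a context consists of the matrices that are diagonal in the standard basis and constant on each block of a partition of the basis indices, so that its projections are the coordinate projections onto unions of blocks; such a projection is encoded by its set of indices. It uses the fact that for a projection e the diagonal entry e[j][j] equals the squared norm of e applied to the j-th basis vector. The names `inner`, `outer` and `TOL` are hypothetical.

```python
TOL = 1e-9

def inner(e, context):
    """Largest projection of the context below e, as a set of basis indices.
    A block lies below e iff e fixes each of its basis vectors,
    i.e. iff e[j][j] = 1 for every j in the block."""
    return {i for block in context
            if all(abs(e[j][j] - 1.0) < TOL for j in block)
            for i in block}

def outer(e, context):
    """Smallest projection of the context above e, as a set of basis indices.
    A block is needed iff e does not annihilate all of its basis vectors,
    i.e. iff e[j][j] > 0 for some j in the block."""
    return {i for block in context
            if any(e[j][j] > TOL for j in block)
            for i in block}

# Projection onto the line spanned by (1, 1, 0) in C^3.
E_LINE = [[0.5, 0.5, 0.0],
          [0.5, 0.5, 0.0],
          [0.0, 0.0, 0.0]]
# Projection onto the plane spanned by the first two basis vectors.
E_PLANE = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0]]

FINE = [{0}, {1}, {2}]      # the maximal diagonal context
COARSE = [{0, 1}, {2}]
OTHER = [{0}, {1, 2}]
```

For `E_LINE` the inner localization is zero in both `FINE` and `COARSE`, while the outer one is the plane {0, 1}; for `E_PLANE` in the context `OTHER` the inner localization is {0} and the outer one is the identity {0, 1, 2}.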

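We are now in a position to define the map (12.168): for $a \in A_{sa}$ we put 

$\delta(a) : (C, \omega) \mapsto [\omega(\delta^i_C(a)), \omega(\delta^o_C(a))]$, (12.178) 

where (as the notation indicates) the point $(C, \omega) \in \Sigma(C) \subset \Sigma^A$ is just $\omega \in \Sigma(C)$. 
It is easily checked that the right-hand side of (12.178) makes sense, since positivity 
of states and (12.174) - (12.175) obviously imply $\omega(\delta^i_C(a)) \leq \omega(\delta^o_C(a))$. Also, $\delta(a)$ 
is continuous, so that $\delta$ is well defined. If we define a closely related map 

$\tilde{\delta}(a) : \Sigma^A \to \mathcal{C}(A) \times \mathbb{I}\mathbb{R}$; (12.179) 

$\tilde{\delta}(a)(C, \omega) = (C, \delta(a)(C, \omega))$, (12.180) 

then $\tilde{\delta}(a)$ is the external description of an internal locale map 

$\underline{\delta}(a) : \underline{\Sigma}(\underline{A}) \to \underline{\mathbb{I}\mathbb{R}}$. (12.181) 

In view of this, we may regard (12.168) as a hybrid (i.e. “category mistake”) map 

$\underline{\delta} : A_{sa} \to C(\underline{\Sigma}(\underline{A}), \underline{\mathbb{I}\mathbb{R}})$; (12.182) 

see the text below (12.146), with $\mathbb{R} \rightsquigarrow \mathbb{I}\mathbb{R}$, for the meaning of the right-hand side. 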

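The relationship between $\delta$ and the Gelfand transform (12.166) is as follows. 
For $a \in A_{sa}$, let $W^*(a)$ be the unital commutative von Neumann algebra generated 
by $a = a^*$ and $1_A$ within $A$. Using (12.164), we then have a Gelfandish isomorphism 

$W^*(a) \cong C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{C})$; (12.183) 

$c \mapsto \hat{c}$. (12.184) 

In particular, since $a \in W^*(a)$, we obtain a continuous function 

$\hat{a} : \Sigma^A_{\uparrow W^*(a)} \to \mathbb{R}$. (12.185) 

Furthermore, we have an inclusion 

$\iota : \mathbb{R} \to \mathbb{I}\mathbb{R}$; (12.186) 

$x \mapsto [x, x]$, (12.187) 

which is continuous, and hence induces a map $C(\Sigma^A, \mathbb{R}) \to C(\Sigma^A, \mathbb{I}\mathbb{R})$, as well as 
maps $C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{R}) \to C(\Sigma^A_{\uparrow W^*(a)}, \mathbb{I}\mathbb{R})$. Then the following diagram commutes: 

[Commutative diagram (12.188)] 

In words, the restriction of the “Daseinisation” $\delta(a) : \Sigma^A \to \mathbb{I}\mathbb{R}$ of $a$ to the open 
subset $\Sigma^A_{\uparrow W^*(a)} \subset \Sigma^A$ takes values in $\mathbb{R} \subset \mathbb{I}\mathbb{R}$, and as such coincides with the Gelfand 
transform $\hat{a}$ of $a$, seen as a map (12.185). Hence, as might be expected in quantum 
mechanics, any fuzziness of $\delta(a)$ is only noticeable outside its own context $W^*(a)$. 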







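The “Daseinisation” construction enables one to interpret propositions $a \in (p, q)$ 
as open subsets of the “phase space” $\Sigma^A$, as in classical physics, where $a : X \to \mathbb{R}$ 
would be a continuous function on a phase space $X$, and one would say that 

$[[a \in (p,q)]]_{CM} = a^{-1}(p,q) \in \mathcal{O}(X)$. (12.189) 

In quantum mechanics, one would interpret $a \in (p,q)$ as the spectral projection 

$[[a \in (p,q)]]_{QM} = [a \in (p,q)] = 1_{(p,q) \cap \sigma(a)}(a)$, (12.190) 

or, equivalently, with the corresponding closed subspace of the ambient Hilbert space. 
In our quantum toposophy setting, however, we may adapt (12.189) as 

$[[a \in (p,q)]]_{QT} = \delta(a)^{-1}(U_{(p,q)}) \in \mathcal{O}(\Sigma^A)$. (12.191) 

Similarly, one may interpret $a \in (p,q)$ as an internal open subset of the internal 
Gelfand spectrum $\underline{\Sigma}(\underline{A})$, as follows. For any locale $Y$ in a topos $\mathsf{T}$, an internal open 
in $\mathcal{O}(Y)$ is defined as an arrow $1 \to \mathcal{O}(Y)$, where as usual 1 is the terminal object in 
$\mathsf{T}$. In the case at hand we have $Y = \underline{\Sigma}(\underline{A})$, and use the composition 

$1 \xrightarrow{\;\underline{(p,q)}\;} \mathcal{O}(\underline{\mathbb{I}\mathbb{R}}) \xrightarrow{\;\underline{\delta}(a)^{-1}\;} \mathcal{O}(\underline{\Sigma}(\underline{A}))$, (12.192) 

where the natural transformation $\underline{(p,q)}$ has components 

$\underline{(p,q)}_C(*) = \uparrow C \times U_{(p,q)}$, (12.193) 

cf. (12.170), and $\underline{\delta}(a)^{-1} : \mathcal{O}(\underline{\mathbb{I}\mathbb{R}}) \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$ is the frame version of the locale map 
(12.181), whose component at $C$, i.e., 

$\underline{\delta}(a)^{-1}_C : \mathcal{O}((\uparrow C) \times \mathbb{I}\mathbb{R}) \to \mathcal{O}(\Sigma^A_{\uparrow C})$, (12.194) 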

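is given on basic opens $\uparrow D \times U_{(p,q)}$ in $(\uparrow C) \times \mathbb{I}\mathbb{R}$, with $D \supseteq C$ and $p < q$, by 

$\underline{\delta}(a)^{-1}_C(\uparrow D \times U_{(p,q)}) = \Sigma^A_{\uparrow D} \cap \delta(a)^{-1}(U_{(p,q)})$. (12.195) 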

We therefore obtain the quantum-toposophical interpretation of $a \in (p,q)$ as: 

$[[a \in (p,q)]]_{QT} : 1 \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$; (12.196) 

$[[a \in (p,q)]]_{QT} = \underline{\delta}(a)^{-1} \circ \underline{(p,q)}$. (12.197) 


We are now going to combine this expression with a construction relating states 
$\omega \in S(A)$ to arrows from $\mathcal{O}(\underline{\Sigma}(\underline{A}))$ to the truth object $\Omega$ in $\mathsf{T}(A)$. This construction 
generalizes the fundamental bijective correspondence between states on commutative 
(unital) C*-algebras $A$ and probability measures on its Gelfand spectrum $\Sigma(A)$ 
(cf. Theorem B.24) to the non-commutative case. 
To this end, we first need to replace probability measures on spaces by probability 
measures on locales. This, in turn, requires the lower real numbers $\mathbb{R}_l$, which may 
be identified with proper subsets $x_l \subset \mathbb{Q}$ with the following two properties: 

1. If $p \in x_l$, then there exists $q \in x_l$ with $p < q$. 

2. If $p < q \in x_l$, then $p \in x_l$ (i.e., $x_l$ is a lower subset of $\mathbb{Q}$). 

In Sets, the lower reals may be identified with $\mathbb{R}$ (in Hilbert’s definition) by identifying 
$x_l$ with its supremum $x = \sup x_l$, but in arbitrary toposes (that admit internal 
natural and hence rational numbers) they drift apart. Similarly, one defines the upper 
real numbers $\mathbb{R}_u$ as proper upper subsets $x_u \subset \mathbb{Q}$ such that $p \in x_u$ implies that there 
exists $q \in x_u$ with $p > q$; once again, in Sets, $\mathbb{R}_u$ may be identified with Hilbert’s $\mathbb{R}$ 
by taking $x = \inf x_u$. The Dedekind real numbers $\mathbb{R}_d$, then, are pairs $(x_l, x_u)$ where 
$x_l \in \mathbb{R}_l$ and $x_u \in \mathbb{R}_u$ are such that $x_l \cap x_u = \emptyset$ and for each $p, q \in \mathbb{Q}$ with $p < q$, either 
$p \in x_l$ or $q \in x_u$. In Sets these may be identified with $\sup x_l = \inf x_u = x$, so that 
$\mathbb{R}_d \cong \mathbb{R}$, but in many toposes $\mathbb{R}_l$, $\mathbb{R}_u$, and $\mathbb{R}_d$ are all different. For example, we have 
already seen that in sheaf toposes $\mathrm{Sh}(X)$, the Dedekind reals are given by the sheaf 
(E.150), but the lower reals turn out to be defined by 

$(\underline{\mathbb{R}}_l)_0 : U \mapsto L(U, \mathbb{R})$, (12.198) 

where $U \in \mathcal{O}(X)$ and $L(U, \mathbb{R})$ is the set of all lower semicontinuous functions from 
$U$ to $\mathbb{R}$ that are locally bounded from above (and similarly for $\mathbb{R}_u$, mutatis mutandis). 
In particular, in $\mathsf{T}(A)$ we have the functor 

$(\underline{\mathbb{R}}_l)_0 : C \mapsto L(\uparrow C, \mathbb{R})$, (12.199) 

which is quite different from (12.3) (and similarly for $\mathbb{R}_u$). 

Definition 12.25. A probability measure on a locale $X$ is a monotone map 

$\mu : \mathcal{O}(X) \to [0,1]_l$, (12.200) 

where $[0,1]_l$ is the collection of lower reals between 0 and 1 (defined by replacing 
$\mathbb{Q}$ in the definition of $\mathbb{R}_l$ by the set of all rationals $0 \leq q \leq 1$), that satisfies 

$\mu(\top) = 1$; (12.201) 

$\mu(U) + \mu(V) = \mu(U \wedge V) + \mu(U \vee V)$; (12.202) 

$\mu(\bigvee_\lambda U_\lambda) = \bigvee_\lambda \mu(U_\lambda)$, (12.203) 

for any directed family $(U_\lambda)$ in $\mathcal{O}(X)$. 
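
In Sets, and for a finite discrete space, the axioms reduce to elementary arithmetic that can be checked exhaustively. The sketch below is only an illustration under these assumptions: the three-point space, the weights and all names are hypothetical, lower reals are replaced by exact rationals, and the continuity axiom (12.203) is not tested because every directed family in a finite lattice contains its own supremum.

```python
from fractions import Fraction as Fr
from itertools import combinations

POINTS = ("x", "y", "z")
WEIGHTS = {"x": Fr(1, 2), "y": Fr(1, 3), "z": Fr(1, 6)}

def all_opens():
    """All subsets of the discrete space, i.e. all opens of the locale."""
    return [frozenset(c) for r in range(len(POINTS) + 1)
            for c in combinations(POINTS, r)]

def mu(u):
    """Valuation of an open set: the sum of the weights of its points."""
    return sum((WEIGHTS[p] for p in u), Fr(0))

def is_monotone():
    return all(mu(u) <= mu(v)
               for u in all_opens() for v in all_opens() if u <= v)

def is_modular():
    """The law mu(U) + mu(V) = mu(U meet V) + mu(U join V) of (12.202)."""
    return all(mu(u) + mu(v) == mu(u & v) + mu(u | v)
               for u in all_opens() for v in all_opens())
```

Here meet and join are intersection and union, and the top element is the whole space, on which `mu` takes the value 1.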

Compared with (probability) measures on $\sigma$-algebras, we see that (probability) measures 
on locales are merely defined on open sets (as opposed to measurable sets, 
which include opens), but this weakening is compensated for by the much stronger 
(i.e. uncountable) additivity axiom (12.203). Indeed, in Sets, if $X$ is a compact Hausdorff 
space, one even has a bijective correspondence between regular probability 
measures $\mu'$ on $X$ as a space and probability measures $\mu$ on $X$ as a locale. 
This definition makes sense in constructive mathematics, and hence it may be internalized 
to $\mathsf{T}(A)$. Doing so, probability measures on the internal Gelfand spectrum 
$\underline{\Sigma}(\underline{A})$ turn out to correspond to the following notion (cf. Definition 2.26). 

Definition 12.26. A quasi-state on a unital C*-algebra $A$ is a map $\omega : A \to \mathbb{C}$ that is 
positive and normalized ($\omega(1_A) = 1$), satisfies $\omega(b + ic) = \omega(b) + i\omega(c)$ for $b^* = b$ 
and $c^* = c$, and is linear on each commutative unital C*-algebra in $A$. 

Theorem 12.27. There is a bijective correspondence between quasi-states $\omega$ on $A$ 
and probability measures $\underline{\mu}_\omega$ on the internal Gelfand spectrum $\underline{\Sigma}(\underline{A})$. 

The proof uses the fact that given the (Alexandrov) topology on $\mathcal{C}(A)$, a function 
$\uparrow C \to [0,1]$ is lower semicontinuous iff it is order-preserving (i.e., monotone); since 
$[0,1]$ is bounded, the condition of local boundedness is trivially satisfied and hence 
$L(\uparrow C, [0,1])$ consists of all order-preserving functions from $\uparrow C \subset \mathcal{C}(A)$ to $[0,1]$. 
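
The equivalence between lower semicontinuity and monotonicity for the Alexandrov topology can be confirmed by brute force on a small poset. The sketch below is an illustration only: the four-element poset of subsets of {0, 1}, the three admissible values and all function names are assumptions, and lower semicontinuity is taken to mean that every strict superlevel set is open, that is, an upper set.

```python
from fractions import Fraction as Fr
from itertools import product

# Subsets of {0, 1} ordered by inclusion, with the Alexandrov topology
# (the open sets are exactly the upper sets).
POSET = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
VALUES = [Fr(0), Fr(1, 2), Fr(1)]

def is_open(s):
    """s is Alexandrov open iff it is upward closed."""
    return all(y in s for x in s for y in POSET if x <= y)

def is_lower_semicontinuous(f):
    """Every superlevel set {x | f(x) > t} is open; since f only takes
    values in VALUES, it suffices to test the thresholds t in VALUES."""
    return all(is_open({x for x in POSET if f[x] > t}) for t in VALUES)

def is_monotone(f):
    return all(f[x] <= f[y] for x in POSET for y in POSET if x <= y)

def all_functions():
    """All 81 functions from POSET to VALUES, as dictionaries."""
    for vals in product(VALUES, repeat=len(POSET)):
        yield dict(zip(POSET, vals))
```

Running both predicates over `all_functions()` shows that they agree on every function, as the remark above asserts for the general case.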

Proof. Any probability measure $\underline{\mu}$ on $\underline{\Sigma}(\underline{A})$ is a natural transformation 

$\underline{\mu} : \mathcal{O}(\underline{\Sigma}(\underline{A})) \to \underline{[0,1]}_l$, (12.204) 

whose component at $C \in \mathcal{C}(A)$, according to (12.138) and (12.199), is a map 

$\underline{\mu}_C : \mathcal{O}(\Sigma^A_{\uparrow C}) \to L(\uparrow C, [0,1])$, (12.205) 

satisfying properties dictated by Definition 12.25. In particular, if $C$ is maximal 
abelian in $A$, then by the comment preceding the proof, $\underline{\mu}_C$ is simply a function 
$\mathcal{O}(\Sigma(C)) \to [0,1]$ that satisfies (12.201) - (12.203) and hence is a (regular) probability 
measure $\mu_C$ on $\Sigma(C)$. Thus by Riesz-Markov one obtains a state $\omega_C$ on each 
maximal abelian $C$. From the topology on $\Sigma^A$ and (12.137) we see that if $D$ is not 
maximal, $\underline{\mu}_D$ is determined by $\underline{\mu}_C$ for any $C \supseteq D$, so that we also obtain a probability 
measure $\mu_D$ on $\Sigma(D)$, or, equivalently, a state $\omega_D$, by restriction of $\omega_C$ to $D$. 
One might fear that $\mu_D$ and $\omega_D$ could depend on the chosen embedding $D \subset C$, but 
naturality of $\underline{\mu}$ implies that if $D \subset C$ as well as $D \subset C'$, where both $C$ and $C'$ are 
maximal, then the ensuing measures $\mu_D$ are the same. This implies the same property 
for the corresponding states $\omega_D$, which in turn shows that the collection of all 
$\mu_D$ and $\mu_C$ thus obtained organizes itself into a single quasi-state $\omega$ on $A$. 

The converse follows by running this argument backwards. □ 

Combining (12.196) with Theorem 12.27, we obtain a state-proposition pairing 
that is no longer probabilistic, as in ordinary quantum mechanics, but defines a 
proposition in the internal language of $\mathsf{T}(A)$ and as such may or may not be true at 
each stage $C \in \mathcal{C}(A)$. The final ingredient for this is an arrow 

$\underline{1} : \mathcal{O}(\underline{\Sigma}(\underline{A})) \to \underline{[0,1]}_l$, (12.206) 

defined by its components $\underline{1}_C : \mathcal{O}(\Sigma^A_{\uparrow C}) \to L(\uparrow C, [0,1])$ that map each open subset of 
$\Sigma^A_{\uparrow C}$ to the constant function on $\uparrow C$ taking the value $1 \in [0,1]$. The internal language 
of $\mathsf{T}(A)$ (cf. §E.5) turns this into a formula $\underline{\mu}_\omega = \underline{1}$ with the following interpretation: 

$[[\underline{\mu}_\omega = \underline{1}]] : \mathcal{O}(\underline{\Sigma}(\underline{A})) \to \Omega$. (12.207) 

We combine this with (12.196) so as to obtain an internal state-proposition pairing 

$[[\underline{\mu}_\omega(a \in (p,q)) = 1]]_{QT} : 1 \to \Omega$, (12.208) 

where we have abbreviated 

$[[\underline{\mu}_\omega(a \in (p,q)) = 1]]_{QT} = [[\underline{\mu}_\omega = \underline{1}]] \circ [[a \in (p,q)]]_{QT}$. (12.209) 

The truth of the proposition (12.208) at stage $C$ may be determined from Kripke-Joyal 
semantics; a straightforward computation for $A = B(H)$ shows that 

$C \Vdash \underline{\mu}_\omega(a \in (p,q)) = 1$ (12.210) 

iff there exists a projection $e \in \mathcal{P}(C)$ with $e \leq [a \in (p,q)]$ and $\omega(e) = 1$. Assuming $\omega$ is 
a vector state $\omega(a) = \langle \psi, a\psi \rangle$ for some unit vector $\psi \in H$, this means that (12.210) 
holds iff $\psi \in eH \subseteq [a \in (p,q)]H$ for some $e \in \mathcal{P}(C)$, i.e., if the proposition $a \in (p,q)$ has 
(Born) probability one in state $\psi$ and there is a yes-no measurement in context $C$ 
verifying this probability. In comparison, in classical mechanics a pure state $x \in X$ 
makes $a \in (p,q)$ true iff $a(x) \in (p,q)$, where $a \in C(X, \mathbb{R})$ as before. 
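
The stage-dependence of this truth value can be made concrete in a three-dimensional toy example. The sketch below rests on assumptions not made in the text: contexts are block partitions of the standard basis (as subalgebras of the diagonal matrices), the spectral projection of the proposition is supplied directly as a real matrix rather than computed from a, and the state is a real unit vector. It uses the observation that a projection e in the context with e below the spectral projection and with expectation one exists iff the largest projection of the context below the spectral projection already has expectation one. All names are hypothetical.

```python
import math

TOL = 1e-9

def inner(e, context):
    """Largest coordinate projection of the context below the projection e,
    returned as a set of basis indices (e[j][j] = 1 iff e fixes basis vector j)."""
    return {i for block in context
            if all(abs(e[j][j] - 1.0) < TOL for j in block)
            for i in block}

def born_probability(psi, e):
    """Expectation <psi, e psi> of the projection e in the vector state psi."""
    n = len(psi)
    return sum(psi[i] * e[i][j] * psi[j] for i in range(n) for j in range(n))

def holds_at(psi, e, context):
    """Truth at a stage: psi lies in the range of some projection of the
    context that is below e, i.e. psi is supported on inner(e, context)."""
    support = inner(e, context)
    return all(abs(psi[i]) < TOL for i in range(len(psi)) if i not in support)

# Spectral projection of a proposition, here the plane spanned by e_0 and e_1.
E_PROP = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0]]
PSI = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]

FINE = [{0}, {1}, {2}]
COARSE = [{0, 1}, {2}]
OTHER = [{0}, {1, 2}]
TRIVIAL = [{0, 1, 2}]
```

The Born probability of the proposition in the state `PSI` is one, yet the pairing is true only at the stages `FINE` and `COARSE`; at `OTHER` and `TRIVIAL` the context contains no projection that verifies it.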

We close this chapter with a topos-theoretical (or, one might say, topological) 
reinterpretation of the Kochen-Specker Theorem, which to some extent explains 
why the previous construction had to use the fuzzy interval domain $\mathbb{I}\mathbb{R}$ rather than 
the sharp reals $\mathbb{R}$. To this end, we first generalize the notion of a quasi-linear 
non-contextual hidden variable (cf. Definitions 6.1 and 6.3) to any (unital) C*-algebra: 

Definition 12.28. 1. A valuation on a unital C*-algebra $A$ is a unital map 

$V : A_{sa} \to \mathbb{R}$ (12.211) 

that is dispersion-free (i.e. multiplicative) and linear on commuting operators. 

2. A point in a frame $\mathcal{O}(X)$ in some topos $\mathsf{T}$ is defined as a frame homomorphism 

$p^* : \mathcal{O}(X) \to \Omega$, (12.212) 

where $\Omega$ is the truth object in $\mathsf{T}$. 

If $A$ is commutative, the Gelfand spectrum $\Sigma(A)$ consists of the valuations on $A$. The 
second part generalizes the notion of a point of a frame in set theory (cf. §C.11). 

Theorem 12.29. For any unital C*-algebra $A$, there are canonical bijective 
correspondences between: 

• Valuations on $A$. 

• Points of $\underline{\Sigma}(\underline{A})$ in $\mathrm{Sh}(\mathcal{C}(A))$. 

• Continuous cross-sections $\sigma : \mathcal{C}(A) \to \Sigma^A$ of the bundle $\pi : \Sigma^A \to \mathcal{C}(A)$. 
Proof. We first give the external description of points of a locale F in a sheaf topos
Sh(X) (cf. §E.4). The subobject classifier in Sh(X) is the sheaf Q:Ui-^ ^{U), 
in terms of which a point of F is a frame map Q. Externally, the point- 

free space defined by the frame Q is given by the identity map idx : X —X, so 
that a point of F externally correspond to a continuous cross-section a : X ^ Y of 
the bundle n : Y ^ X (i.e., n o a — idx). In principle, n and a are by definition 
frame maps in the opposite direction, but in the case at hand, namely X = ‘rf{A) and 
F = the map a : ‘rf{A) E'^ may be interpreted as a continuous cross-section 
of the projection (12.134) in the usual sense. Being a cross-section simply means 
that (7(C) € E(C). As to continuity, by definition of the Alexandrov topology, a is 
continuous iff the following condition is satisfied; 

Eor all ^ G ff{E'^) and all C C D, if (j(C) G then a{D) G 

Hence, given the definition of i^{E'^), the following condition is sufficient for conti¬ 
nuity: if C C D, then a{D)\^Q = (7(C). However, this condition is also necessary. To 
explain this, let poc '■ ^(D) ^(C) again be the restriction map. This map is con¬ 

tinuous and open. Suppose Pdc{<^{D)) f C7(C). Since EfO) is Hausdorff, there is 
an open neighbourhood fYo of pf^{a{C)) not containing cy{D). Let fYc = Pdc{^d) 
and take any fY G ff{E^) such that fY (T Y?{E{C)) = fYc and fY (T G{E{D)) = 

This is possible, since YYq and fYo satisfy both conditions in the definition of G(E^'). 
By construction, C7(C) G fY but oiD') ^ fY , so that a is not continuous. Hence a is 
a continuous cross-section of $\pi$ iff

$$\sigma(D)|_C = \sigma(C) \quad \text{for all } C \subseteq D. \tag{12.213}$$

Now define a map $V : A_{\mathrm{sa}} \to \mathbb{C}$ by $V(a) = \sigma(C^*(a))(a)$, where $C^*(a)$ is the
commutative unital C*-algebra generated by $a$. If $b^* = b$ and $[a,b] = 0$, then
$V(a+b) = V(a) + V(b)$ by (12.213), applied to $C^*(a) \subseteq C^*(a,b)$ as well as to
$C^*(b) \subseteq C^*(a,b)$. Furthermore, since $\sigma(C) \in \Sigma(C)$, the map $V$ is dispersion-free.

Conversely, a valuation $V$ defines a cross-section $\sigma$ by complex linear extension
of $\sigma(C)(a) = V(a)$, where $a \in C_{\mathrm{sa}}$. By the criterion (12.213) this cross-section is
continuous, since the value $V(a)$ is independent of the choice of $C$ containing $a$. □

Corollary 12.30. The bundle 71: E^ ‘^(■^) (cf Corollary 12.22) admits no con¬ 
tinuous cross-sections as soon as A has no valuations (e.g. if A = Mf,(C), n > 2). 

The contrast between the pointlessness of the internal spectrum E and the spa- 
tiality of the external spectrum E'^ is striking, but easily explained: a point of E'^ (in 
the usual sense, but also in the frame-theoretic sense if E^ is sober) necessarily lies 
in some E{C) and hence is defined (and dispersion-free) only in the context 

C. Eor example, for A = M„(C), a point V € E (C) corresponds to a map 

y* : ^(E^) {0,1}, S hG y(5(C)), (12.214) 

where ^(E^) is given by (12.95). Thus V* is only sensitive to the value of S at C. 
Notes

Previous advocates of intuitionistic logic for quantum mechanics include Popper
(1968) and Coecke (2002). The earliest use of topos theory in quantum mechanics
was probably by Adelman & Corbett (1995), but the founding papers of the topos
approach to quantum mechanics as further developed in this chapter are Isham &
Butterfield (1998), Butterfield & Isham (1999, 2002), and Hamilton, Isham &
Butterfield (2000). This series of papers was predated by Isham (1997) and was
followed by Döring & Isham (2008abcd, 2010); see also Flori (2013) for an
introduction. Wolters (2013ab) gives a detailed comparison between the “contravariant”
Butterfield-Döring-Isham approach and the “covariant” approach in this chapter.

The original motivation behind our approach to “quantum toposophy” was the
Principle of General Tovariance (Heunen, Landsman, & Spitters, 2008), which
was a pun on Einstein’s Principle of General Covariance underlying General
Relativity (Norton, 1993, 1995). Einstein based his theory of gravity and space-time
on the mathematical postulate that all equations of physics be invariant under
arbitrary coordinate transformations, and similarly we proposed that all physical
theories should be invariant under so-called geometric morphisms between toposes
and hence should be formulated in terms of what (confusingly) is called geometric
logic (cf. Mac Lane & Moerdijk, 1992; Johnstone, 2002). Since in fact some
of our constructions turned out to be non-geometric in this sense, we subsequently
dropped this principle and stopped even referring to the above paper. However, as
Raynaud (2014) and, more generally, Henry (2015) show, our theory can actually be
made geometric (in the topos-theoretical sense) provided one puts the entire theory
of (internal) C*-algebras on a localic (i.e., pointfree) basis, as in Henry (2014ab).
Other recent developments of the program (which are not discussed here) may be
found in e.g. van den Berg & Heunen (2012, 2014), Spitters, Vickers, & Wolters
(2014), Heunen (2014ab), and Heunen & Lindenhovius (2015).

§12.1. C*-algebras in a topos 

C*-algebras in a topos, including a constructive version of Gelfand duality for
commutative unital C*-algebras that is valid in arbitrary Grothendieck toposes, were
first studied by Banaschewski & Mulvey (2000ab, 2006). The topos T(A) and the
internal commutative C*-algebra A were introduced by Heunen, Landsman, & Spitters
(2009). All these papers rely crucially on the theory of internal locales in toposes,
which owes much to Johnstone (1982) and Joyal & Tierney (1984). See also
Johnstone (1983) and Vickers (2007). It is possible to realize T(A) as the topos of sheaves
on the locale $\mathrm{Idl}(\mathcal{C}(A))$, which is the ideal completion of the “mere” poset $\mathcal{C}(A)$,
but we will not use this description (Raynaud, 2014).

§12.2. The Gelfand spectrum in constructive mathematics 

This section is based on Coquand (2005) and Coquand & Spitters (2005, 2009), 
where also the missing details may be found. All necessary background on lattice 
theory is provided by Johnstone (1982), except the ingredients for the proof that the 
constructive Gelfand spectrum is compact and regular, which is due to Cederquist 
& Coquand (2000). Proposition 12.10 may be found in Aczel (2006). 
§12.3. Internal Gelfand spectrum and intuitionistic quantum logic

This section is based on Caspers, Heunen, Landsman, & Spitters (2009), except
for the final part on Kripke semantics, which is taken from Heunen, Landsman,
& Spitters (2012). An interesting philosophical analysis of the intuitionistic logic
emerging from this program may be found in Hermens (2016), to whom the
interpretation of elements of the frame as disjunctions is due.

§12.4. Internal Gelfand spectrum for arbitrary C*-algebras 

This section is based on Caspers (2008), Caspers, Heunen, Landsman, & Spitters
(2009), and Heunen, Landsman, & Spitters (2009). Complete proofs of Lemma
12.15 and Lemma 12.16 may be found in Caspers (2008), §5.2. For different proofs
of these lemmas see Heunen, Landsman, & Spitters (2009) and Coquand (2005),
respectively. A proof of Lemma 12.21 may be found in Wolters (2013b), Theorem
2.17, also available as http://arxiv.org/pdf/1010.2031v2.pdf.

§12.5. “Daseinisation” and Kochen-Specker Theorem 

The spectral order was introduced by Olson (1971) and was rediscovered by De
Groote (2011). For a devastating critique of Heidegger’s philosophy see Philipse
(1999). The first construction of a “Daseinisation” map was given by Döring &
Isham (2008b). The version presented here is an improvement, due to Wolters
(2013ab), of a previous adaptation of the Döring-Isham approach to the topos T(A)
in Heunen, Landsman, & Spitters (2009). Similarly, Theorem 12.29, first published
in Heunen, Landsman, Spitters, & Wolters (2012), is an improvement due to Wolters
(2013a) of an earlier result in this direction in Heunen, Landsman, & Spitters (2009).

The work of Isham & Butterfield (1998), which, as already mentioned, started the
entire quantum toposophy program, was actually motivated by a topos-theoretical
reformulation of the Kochen-Specker Theorem. Isham and Butterfield started from
the following observation. Let $\mathcal{C}(B(H))$ be the poset of commutative von Neumann
subalgebras of $B(H)$, partially ordered by set-theoretic inclusion, seen as a category
in the usual way. Consider the presheaf topos of contravariant functors
$F : \mathcal{C}(B(H)) \to \mathsf{Set}$, where $\mathsf{Set}$ is the category of sets. The spectral presheaf is
the contravariant functor $\Sigma$ defined on objects by $\Sigma_0(C) = \Sigma(C)$, and by the natural
map on arrows, that is, $\Sigma_1(C \subseteq D)$ maps $\omega \in \Sigma(D)$ (which is a map $D \to \mathbb{C}$) to its
restriction to $C$, i.e., to $\omega|_C \in \Sigma(C)$. A point of some object $F$ in this presheaf topos
is defined as a natural transformation $1 \to F$, where $1$ is the terminal object, i.e., the
presheaf that maps everything into the singleton set $*$.

The Kochen-Specker Theorem à la Butterfield & Isham, then, states that if
$\dim(H) > 2$ as usual, the spectral presheaf has no points.
Finite-dimensional Hilbert spaces


Although we assume the reader to be familiar with linear algebra, some of the points 
below may not be emphasized at that level and hence need to be recalled. 

Unless explicitly stated otherwise, all vector spaces (and hence also all algebras)
are defined over the complex numbers $\mathbb{C}$. Moreover, from §A.2 until the end of this
appendix, V will be finite-dimensional; the infinite-dimensional case will be treated
in the next appendix on functional analysis and general Hilbert spaces.


A.1 Basic definitions

Definition A.1. Let $V$ be a vector space (not necessarily finite-dimensional).

1. A sesquilinear form on $V$ is a map $V \times V \to \mathbb{C}$, written $(v,w) \mapsto \langle v,w\rangle$ (or,
occasionally, to distinguish it from an inner product, as $(v,w) \mapsto B(v,w)$) that is
real-bilinear and satisfies $\langle iv,w\rangle = -i\langle v,w\rangle$ and $\langle v,iw\rangle = i\langle v,w\rangle$ for all $v,w \in V$.

2. A hermitian form on $V$ is a sesquilinear form that satisfies $\langle w,v\rangle = \overline{\langle v,w\rangle}$.

3. A pre-inner product on $V$ is a positive hermitian form, i.e., $\langle v,v\rangle \geq 0$.

4. An inner product on $V$ is, in addition, positive definite: $\langle v,v\rangle = 0$ iff $v = 0$.

5. A norm on $V$ is a function $\|\cdot\| : V \to \mathbb{R}^+$ satisfying, for all $v,w \in V$ and $\lambda \in \mathbb{C}$:

a. $\|v+w\| \leq \|v\| + \|w\|$ (triangle inequality);

b. $\|\lambda v\| = |\lambda|\,\|v\|$ (homogeneity);

c. $\|v\| = 0$ iff $v = 0$ (positive definiteness).

Many analytical arguments in functional analysis are based on the fundamental
Cauchy-Schwarz inequality, which is satisfied by any (pre-) inner product:

$$|\langle v,w\rangle|^2 \leq \langle v,v\rangle\,\langle w,w\rangle. \tag{A.1}$$

Proposition A.2. An inner product on $V$ defines a norm on $V$ by means of

$$\|v\| = \sqrt{\langle v,v\rangle}. \tag{A.2}$$
The Cauchy-Schwarz inequality (A.1) then reads

$$|\langle v,w\rangle| \leq \|v\|\,\|w\|, \tag{A.3}$$

with equality iff $v$ and $w$ are linearly dependent.

The question arises when a norm comes from an inner product via (A.2).

Theorem A.3. A norm $\|\cdot\|$ comes from an inner product through (A.2) iff

$$\|v+w\|^2 + \|v-w\|^2 = 2\left(\|v\|^2 + \|w\|^2\right). \tag{A.4}$$

In that case, one has the polarization identity

$$\langle v,w\rangle = \tfrac{1}{4}\left(\|v+w\|^2 - \|v-w\|^2 + i\|v-iw\|^2 - i\|v+iw\|^2\right). \tag{A.5}$$

Proof. Given (A.4), define $\langle v,w\rangle$ by (A.5). Easy computations then show that (A.2)
holds, that $\langle w,v\rangle = \overline{\langle v,w\rangle}$, and, with a bit more effort, that
$\langle v,w_1+w_2\rangle = \langle v,w_1\rangle + \langle v,w_2\rangle$. Now suppose we know that

$$\langle w,sv\rangle = s\langle w,v\rangle \tag{A.6}$$

for certain $s \in \mathbb{R}$. Then this property clearly also holds for $-s$ and (if $s \neq 0$) for $1/s$
instead of $s$. Furthermore, having (A.6) for $s$ as well as $t \in \mathbb{R}$ implies the same
property also for $s+t$ and $st$. Starting with $s = t = 1$, this generates (A.6) for each
$s \in \mathbb{Q}$. Now if $s_n \to s$ for $s_n \in \mathbb{Q}$ and $s \in \mathbb{R}$, then by continuity and homogeneity
of the norm, $\langle w,s_n v\rangle \to \langle w,sv\rangle$. Consequently, (A.6) holds for each $s \in \mathbb{R}$. Finally,
from (A.5) we also find $\langle w,iv\rangle = i\langle w,v\rangle$, and hence (A.6) holds for each $s \in \mathbb{C}$. □
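As a numerical illustration of Theorem A.3, the following minimal sketch checks the parallelogram law (A.4) and the polarization identity (A.5) for the standard inner product on $\mathbb{C}^2$. The helper names (`inner`, `norm_sq`, `add`, `polarization`) and the sample vectors are illustrative choices, not notation from the text.

```python
# Sketch: verify (A.4) and (A.5) for the standard inner product on C^2,
# which is conjugate-linear in the first slot and linear in the second.

def inner(v, w):
    """Standard inner product on C^n."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = <v, v>, cf. (A.2)."""
    return inner(v, v).real

def add(v, w, c=1):
    """Return v + c*w componentwise."""
    return [x + c * y for x, y in zip(v, w)]

def polarization(v, w):
    """Right-hand side of (A.5), built from norms only."""
    return 0.25 * (norm_sq(add(v, w)) - norm_sq(add(v, w, -1))
                   + 1j * norm_sq(add(v, w, -1j))
                   - 1j * norm_sq(add(v, w, 1j)))

# Arbitrary sample vectors.
v = [1 + 2j, -0.5j]
w = [3 - 1j, 2 + 0j]

parallelogram_lhs = norm_sq(add(v, w)) + norm_sq(add(v, w, -1))
parallelogram_rhs = 2 * (norm_sq(v) + norm_sq(w))
recovered = polarization(v, w)   # should agree with inner(v, w)
```

The point of the check is that `polarization` never calls `inner` on two different vectors: the inner product is recovered from the norm alone, which is the content of the "if" direction of the theorem.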

There is an analogous result for continuous hermitian forms, with practically the
same proof (where continuity is once again needed to pass from $\mathbb{Q}$ to $\mathbb{R}$). Let $V$ be
a vector space with inner product, and let $B : V \times V \to \mathbb{C}$ be a hermitian form. The
associated quadratic form $Q : V \to \mathbb{R}$, defined by

$$Q(v) = B(v,v), \tag{A.7}$$

then satisfies

$$Q(zv) = |z|^2 Q(v) \quad (z \in \mathbb{C}); \tag{A.8}$$

$$Q(v+w) + Q(v-w) = 2\left(Q(v) + Q(w)\right). \tag{A.9}$$

Proposition A.4. Let $V$ be a vector space with inner product. A map $Q : V \to \mathbb{R}$
that is continuous in the associated norm (A.2) is derived from a hermitian form
$B : V \times V \to \mathbb{C}$ through (A.7) iff $Q$ satisfies (A.8) - (A.9), in which case

$$B(v,w) = \tfrac{1}{4}\left(Q(v+w) - Q(v-w) + iQ(v-iw) - iQ(v+iw)\right). \tag{A.10}$$
A.2 Functionals and the adjoint


In the remainder of this appendix, $V$ is a finite-dimensional complex vector space
with inner product. Since this is automatically a (finite-dimensional) Hilbert space
(as defined in the next appendix), we rename it as $H$. The archetypal example is
$H = \mathbb{C}^n$, with elements $z = (z_1,\ldots,z_n) \in \mathbb{C}^n$, and standard inner product

$$\langle z,w\rangle = \sum_{i=1}^{n} \overline{z_i}\, w_i. \tag{A.11}$$

In that case, we hardly make a difference between a linear map $a : H \to H$ and the
corresponding matrix $(a_{ij})$, where $(az)_i = \sum_j a_{ij} z_j$ or, equivalently,

$$a_{ij} = \langle \upsilon_i, a\upsilon_j\rangle, \tag{A.12}$$

where $(\upsilon_1 = (1,0,\ldots,0), \ldots, \upsilon_n = (0,\ldots,0,1))$ is the standard basis of $\mathbb{C}^n$. More
generally, we will only consider orthonormal bases of Hilbert spaces $H$, i.e., bases
$(\upsilon_i)$ for which $\langle \upsilon_i,\upsilon_j\rangle = \delta_{ij}$. In fact, in the present (finite-dimensional) case, any
orthonormal set of $n = \dim(H)$ vectors is automatically a basis. Throughout this
book, the word “basis” will be synonymous with orthonormal basis.

Let $H^*$ be the vector space of linear maps $f : H \to \mathbb{C}$, also called (linear)
functionals (on $H$). Since the inner product is positive definite, it is also non-degenerate:

Proposition A.5. The map $\psi \mapsto f_\psi$, where

$$f_\psi(\varphi) = \langle \psi,\varphi\rangle, \tag{A.13}$$

is an anti-linear isomorphism $H \to H^*$ (i.e., one has $\lambda\psi \mapsto \overline{\lambda} f_\psi$ for any $\lambda \in \mathbb{C}$).

Proof. Injectivity is obvious. For surjectivity, note that $\mathrm{coker}(f)$ (i.e., the orthogonal
complement of the kernel $\ker(f)$ of $f$) is one-dimensional (assuming $f$ is nonzero),
and take a unit vector $\chi \in \mathrm{coker}(f)$. Then $\psi = \overline{f(\chi)}\,\chi$ does the job: by linearity of
$f$, we have $f(\varphi)\chi - f(\chi)\varphi \in \ker(f)$ for any $\varphi \in H$ (and even any $\chi \in H$), so that
$\langle \chi, f(\varphi)\chi - f(\chi)\varphi\rangle = 0$. Since $\langle\chi,\chi\rangle = \|\chi\|^2 = 1$, this yields $f = f_\psi$. □

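A linear map $a : H \to H$ is also called an operator; we denote the algebra of
all operators on $H$ by $B(H)$. For example, we have $B(\mathbb{C}^n) = M_n(\mathbb{C})$. Two arbitrary
vectors $\psi,\varphi \in H$ define an operator $|\psi\rangle\langle\varphi|$ through Dirac’s “bra-ket” notation

$$|\psi\rangle\langle\varphi|\,\chi = \langle\varphi,\chi\rangle\,\psi. \tag{A.14}$$

The adjoint $a^*$ of an operator $a$ is defined by the property

$$\langle a^*\psi,\varphi\rangle = \langle\psi,a\varphi\rangle \quad (\psi,\varphi \in H). \tag{A.15}$$

Indeed, for given $\chi$ (and $a$), define a functional $f_{a,\chi} : H \to \mathbb{C}$ by $f_{a,\chi}(\varphi) = \langle\chi,a\varphi\rangle$.
Then, as we just saw, $f_{a,\chi} = f_\psi$ for some unique $\psi \in H$; define $a^*$ by $a^*\chi = \psi$. This
map is linear by construction.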
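For $H = \mathbb{C}^n$ the defining property (A.15) can be checked directly against the conjugate-transpose matrix. The sketch below does this for one arbitrary $2 \times 2$ matrix and one pair of vectors, and also builds the rank-one operator of (A.14); all function names and sample data are illustrative assumptions.

```python
# Sketch: the adjoint on C^n as conjugate transpose, checked against (A.15).

def inner(v, w):
    """Inner product on C^n, conjugate-linear in the first argument, cf. (A.11)."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product (a v)_i = sum_j a_ij v_j."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in a]

def adjoint(a):
    """Conjugate transpose: (a*)_ij = conj(a_ji)."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def ketbra(psi, phi):
    """Matrix of |psi><phi|, which sends chi to <phi, chi> psi, cf. (A.14)."""
    return [[p * complex(q).conjugate() for q in phi] for p in psi]

# Arbitrary sample operator and vectors.
a = [[1 + 1j, 2], [0.5j, 3 - 2j]]
psi = [1j, 2]
phi = [1 - 1j, 0.5]

lhs = inner(apply(adjoint(a), psi), phi)   # <a* psi, phi>
rhs = inner(psi, apply(a, phi))            # <psi, a phi>
```

Applying `adjoint` to `ketbra(psi, phi)` returns the matrix of `ketbra(phi, psi)` up to rounding, which is the matrix form of the rule for the adjoint of a rank-one operator.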
Clearly, one has

$$a^{**} = a. \tag{A.16}$$

For $H = \mathbb{C}^n$, the matrix corresponding to the adjoint $a^*$ is given by the well-known
formula $a^*_{ij} = \overline{a_{ji}}$. A more abstract example of an adjoint is given by

$$\left(|\psi\rangle\langle\varphi|\right)^* = |\varphi\rangle\langle\psi|. \tag{A.17}$$

The (operator) norm of $a : H \to H$ is defined by

$$\|a\| = \sup\{\|a\psi\|,\ \psi \in H_1\}, \tag{A.18}$$

where the unit sphere $H_1 \subset H$ is defined by

$$H_1 = \{\psi \in H,\ \|\psi\| = 1\}. \tag{A.19}$$

Proposition A.6. One has $\|a\| < \infty$ for any linear map $a : H \to H$.

Proof. Recall that $\dim(H) = n < \infty$. Map $H$ to $\mathbb{C}^n$ by the choice of some basis
$(\upsilon_i)$. Thus $\psi \in H$ is mapped to $\tilde\psi = (\psi_1,\ldots,\psi_n) \in \mathbb{C}^n$, with $\psi_i = \langle\upsilon_i,\psi\rangle$, and we
have $\|\tilde\psi\|_2 = \|\psi\|$, where $\|z\|_2^2 = \sum_i |z_i|^2$ is the usual norm on $\mathbb{C}^n$, which is given
by (A.2) with (A.11). This also transfers the operator $a : H \to H$ to a linear map
$\tilde a : \mathbb{C}^n \to \mathbb{C}^n$ defined by the matrix (A.12). Then $\|a\| = \|\tilde a\| = \sup\{\|\tilde a z\|_2,\ z \in \mathbb{C}^n_1\}$,
where $\mathbb{C}^n_1 = \{z \in \mathbb{C}^n,\ \|z\|_2 = 1\}$. Now $\tilde a$ is continuous because it is linear, and hence
it maps $\mathbb{C}^n_1$ (which is compact by Heine-Borel) to some compact set $\tilde a(\mathbb{C}^n_1)$ in $\mathbb{C}^n$.
It is easy to see that the norm $\|\cdot\|_2 : \mathbb{C}^n \to \mathbb{R}^+$ is continuous, and according to
Weierstrass the norm therefore assumes a finite maximum (as well as a minimum)
on any compact set $K$. Taking $K = \tilde a(\mathbb{C}^n_1)$ proves the claim. □

Proposition A.7. Let $a,b : H \to H$ be linear maps, and let $\psi \in H$. Then:

$$\|a\psi\| \leq \|a\|\,\|\psi\|; \tag{A.20}$$

$$\|ab\| \leq \|a\|\,\|b\|; \tag{A.21}$$

$$\|a^*\| = \|a\|; \tag{A.22}$$

$$\|a^*a\| = \|a\|^2. \tag{A.23}$$

Proof. The first two inequalities are immediate from (A.18). Next, if $\|\psi\| = 1$, by
(A.3), (A.15), and (A.20) we have

$$\|a^*\psi\|^2 = \langle a^*\psi,a^*\psi\rangle = \langle\psi,aa^*\psi\rangle \leq \|\psi\|\,\|aa^*\psi\| \leq \|a\|\,\|a^*\psi\|, \tag{A.24}$$

so $\|a^*\psi\| \leq \|a\|$, and hence from (A.18), $\|a^*\| \leq \|a\|$. But (A.16) gives the opposite
inequality, whence (A.22). Finally, (A.21) and (A.22) yield $\|a^*a\| \leq \|a^*\|\,\|a\| = \|a\|^2$.
From (A.3) and (A.20), on the other hand, we obtain

$$\|a\psi\|^2 = \langle a\psi,a\psi\rangle = \langle\psi,a^*a\psi\rangle \leq \|a^*a\|, \tag{A.25}$$

so $\|a\|^2 \leq \|a^*a\|$ by (A.18), and hence (A.23) is proved. □
A.3 Projections

The most important examples (and also, as we will see shortly, building blocks) of
self-adjoint operators are projections $e : H \to H$, defined by the property

$$e^2 = e^* = e. \tag{A.26}$$

Proposition A.8. There is a bijective correspondence $e \leftrightarrow L$ between:

• projections $e$ on $H$;

• linear subspaces $L$ of $H$,

given by

$$L = eH; \tag{A.27}$$

$$e = \sum_i |\upsilon_i\rangle\langle\upsilon_i|, \tag{A.28}$$

where $eH = \{e\psi,\ \psi \in H\}$ is the image of $e$, and $(\upsilon_i)$ is an arbitrary basis of $L$.

The proof is routine, including the fact that (A.28) is independent of the basis.
Whenever convenient, we write (A.28) as $e_L$. For example, the “sub”space $L = H$
corresponds to $e_H = 1_H$, whereas $L = \{0\}$ corresponds to $e_{\{0\}} = 0$.

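Define the orthogonal complement $L^\perp$ of any subset $L \subset H$ by

$$L^\perp = \{\psi \in H \mid \langle\varphi,\psi\rangle = 0\ \ \forall \varphi \in L\}. \tag{A.29}$$

In particular, if $L$ is a linear subspace of $H$, one easily checks that

$$e_{L^\perp} = 1 - e_L. \tag{A.30}$$

Corollary A.9. For each linear subspace $L \subset H$ one has

$$H = L \oplus L^\perp, \tag{A.31}$$

in the sense that $L \cap L^\perp = \{0\}$, and each vector $\psi \in H$ has a unique decomposition

$$\psi = \psi^\parallel + \psi^\perp, \tag{A.32}$$

where $\psi^\parallel \in L$ and $\psi^\perp \in L^\perp$.

Proof. Existence of the decomposition is given by

$$\psi^\parallel = e_L\psi; \tag{A.33}$$

$$\psi^\perp = (1 - e_L)\psi. \tag{A.34}$$

Uniqueness follows by assuming $\psi = \chi^\parallel + \chi^\perp$ with $\chi^\parallel \in L$ and $\chi^\perp \in L^\perp$: one then
has $\psi^\parallel - \chi^\parallel = \chi^\perp - \psi^\perp$, but since the left-hand side is in $L$ and the right-hand side
is in $L^\perp$, both sides lie in $L \cap L^\perp = \{0\}$. □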
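The correspondence of Proposition A.8 and the decomposition of Corollary A.9 can be illustrated numerically. The following sketch builds the projection onto a one-dimensional subspace of $\mathbb{C}^3$ via (A.28) and splits a vector as in (A.32) - (A.34); the subspace, the sample vector, and all helper names are assumptions made for the example.

```python
import math

# Sketch: a rank-one projection e_L on C^3 and the splitting psi = psi_par + psi_perp.

def inner(v, w):
    """Inner product on C^n, conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in a]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def ketbra(psi, phi):
    """Matrix of |psi><phi|."""
    return [[p * complex(q).conjugate() for q in phi] for p in psi]

def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Unit vector spanning L (an arbitrary illustrative choice).
s = 1 / math.sqrt(2)
upsilon = [s, 1j * s, 0]
e = ketbra(upsilon, upsilon)                        # e_L as in (A.28)

psi = [2 - 1j, 1, 3j]
psi_par = apply(e, psi)                             # (A.33)
psi_perp = [x - y for x, y in zip(psi, psi_par)]    # (A.34)
```

Here `matmul(e, e)` and `adjoint(e)` both reproduce `e` up to rounding, as required by (A.26), and `inner(psi_par, psi_perp)` vanishes to the same accuracy.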
A.4 Spectral theory

An eigenvector of an operator $a$ is a nonzero element $\psi \in H$ such that

$$a\psi = \lambda\psi \tag{A.35}$$

for some $\lambda \in \mathbb{C}$, called an eigenvalue of $a$. We also define the eigenspace $H_\lambda$ by

$$H_\lambda = \{\psi \in H \mid a\psi = \lambda\psi\}, \tag{A.36}$$

with associated projection $e_\lambda$ (in that $H_\lambda = e_\lambda H$, cf. Proposition A.8). In case that
$\dim(H_\lambda) = 1$ the eigenvalue $\lambda$ is called non-degenerate (or simple). Otherwise it
is said to be degenerate, with multiplicity $m_\lambda = \dim(H_\lambda)$. In linear algebra, the set
of all eigenvalues of $a$ is called the spectrum of $a$, denoted by $\sigma(a)$ (for
infinite-dimensional $H$, this turns out to be the wrong definition of the spectrum, see §B.14).
We now give two formulations of the spectral theorem for self-adjoint operators.

Theorem A.10. Let $a$ be a self-adjoint operator on $H$. Then $\sigma(a) \subset \mathbb{R}$, eigenspaces
for different eigenvalues $\lambda \neq \mu$ are orthogonal (i.e., $e_\lambda e_\mu = \delta_{\lambda\mu} e_\lambda$), and

$$a = \sum_{\lambda \in \sigma(a)} \lambda \cdot e_\lambda; \tag{A.37}$$

$$1_H = \sum_{\lambda \in \sigma(a)} e_\lambda. \tag{A.38}$$

Equivalently, we may reformulate the above spectral resolution of $a$ in terms of the
existence of a basis $(\upsilon_i)$ of $H$ consisting of eigenvectors of $a$. In that case, we have

$$a = \sum_{i=1}^{\dim(H)} \lambda_i\, |\upsilon_i\rangle\langle\upsilon_i|; \tag{A.39}$$

$$1_H = \sum_{i=1}^{\dim(H)} |\upsilon_i\rangle\langle\upsilon_i|, \tag{A.40}$$

where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $\upsilon_i$ (i.e., $a\upsilon_i = \lambda_i\upsilon_i$).

Note that the eigenvalues $\lambda$ occurring in (A.37) are all different, whereas the $\lambda_i$ in
(A.39) need not be: the number of times an eigenvalue $\lambda_i \in \sigma(a)$ occurs is given
by its multiplicity. This also implies that the spectral resolution (A.37) - (A.38) is
canonical (i.e. free of any choices), whereas (A.39) - (A.40) depends on arbitrary
choices of bases in all subspaces $H_\lambda$ with dimension greater than one. Nonetheless,
it is easier to prove (A.39) - (A.40), which obviously imply (A.37) - (A.38): just
collect all $\lambda_i$ that are equal to $\lambda$ and realize that, as in (A.28), one has

$$e_\lambda = \sum_{i \mid \lambda_i = \lambda} |\upsilon_i\rangle\langle\upsilon_i|. \tag{A.41}$$
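A small worked instance may help fix the spectral resolution (A.37) - (A.38). The sketch below treats one self-adjoint $2 \times 2$ matrix: its eigenvalues are obtained from the trace and determinant, and, assuming the two eigenvalues are distinct, the spectral projections are computed as $e_\lambda = (a - \mu)/(\lambda - \mu)$. That formula and the chosen matrix are illustrative assumptions, not taken from the text.

```python
import math

# Sketch: spectral resolution of a self-adjoint 2x2 matrix with two distinct eigenvalues.

a = [[2, 1 - 1j], [1 + 1j, 3]]   # illustrative self-adjoint matrix

# Eigenvalues of a 2x2 hermitian matrix from its trace and determinant.
tr = (a[0][0] + a[1][1]).real
det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
disc = math.sqrt(tr * tr - 4 * det)
lam_plus, lam_minus = (tr + disc) / 2, (tr - disc) / 2

def spectral_projection(lam, mu):
    """e_lam = (a - mu*1) / (lam - mu); valid only when lam != mu."""
    return [[(a[i][j] - (mu if i == j else 0)) / (lam - mu) for j in range(2)]
            for i in range(2)]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

e_plus = spectral_projection(lam_plus, lam_minus)
e_minus = spectral_projection(lam_minus, lam_plus)

# Right-hand sides of (A.37) and (A.38), and the product e_plus e_minus.
reconstructed = [[lam_plus * e_plus[i][j] + lam_minus * e_minus[i][j]
                  for j in range(2)] for i in range(2)]
unit = [[e_plus[i][j] + e_minus[i][j] for j in range(2)] for i in range(2)]
cross = matmul(e_plus, e_minus)
```

For this matrix the trace is 5 and the determinant is 4, so the eigenvalues are 4 and 1; `reconstructed` agrees with `a`, `unit` with the identity, and `cross` with the zero matrix, all up to rounding.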
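More generally, for some (at the moment) arbitrary (but later: measurable) subset
$\Delta \subset \mathbb{R}$ it turns out to be convenient to introduce the spectral projection $e_\Delta$ on $H$
and the associated spectral subspace $H_\Delta \subset H$: if $\Delta \cap \sigma(a) = \emptyset$ we put $e_\Delta = 0$ and
$H_\Delta = \{0\}$, and otherwise,

$$e_\Delta = \sum_{\lambda \in \Delta \cap \sigma(a)} e_\lambda; \tag{A.42}$$

$$H_\Delta = e_\Delta H. \tag{A.43}$$

We now prepare for the proof of Theorem A.10. First, note from (A.15) that

$$2i \cdot \mathrm{Im}(\langle\psi,a\psi\rangle) = \langle\psi,a\psi\rangle - \overline{\langle\psi,a\psi\rangle} = \langle\psi,a\psi\rangle - \langle\psi,a^*\psi\rangle. \tag{A.44}$$

If $a^* = a$, from (A.35) and (A.44) one obtains $\mathrm{Im}(\lambda) = 0$ and hence $\sigma(a) \subset \mathbb{R}$.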

Lemma A.11. A self-adjoint operator $a$ has an eigenvalue $\lambda$ for which $|\lambda| = \|a\|$.

Proof. As in the previous proof, the norm $\|\cdot\|$ assumes a maximum on the compact
set $aH_1$, where $H_1 = \{\psi \in H,\ \|\psi\| = 1\}$. Suppose this happens at $a\psi_1$, where by
construction $\|\psi_1\| = 1$. By definition of the norm, this maximum must be $\|a\|$, so
that $\|a\| = \|a\psi_1\|$. Hence, using $a^* = a$, (A.3), and (A.23), we may estimate

$$\|a\|^2 = \|a\psi_1\|^2 = \langle a\psi_1,a\psi_1\rangle = \langle\psi_1,a^2\psi_1\rangle \leq \|\psi_1\|\,\|a^2\psi_1\| \leq \|a^2\| = \|a\|^2. \tag{A.45}$$

Hence we need equality at the $\leq$ signs in (A.45), which according to the remark
below (A.3) can only be the case if $a^2\psi_1 = \|a\|^2\psi_1$. Define $\chi_1 = a\psi_1 - \|a\|\psi_1$.

There are two possibilities: if $\chi_1 = 0$, then $a\psi_1 = \|a\|\psi_1$, and if $\chi_1 \neq 0$, then

$$a\chi_1 = a^2\psi_1 - \|a\|\,a\psi_1 = \|a\|^2\psi_1 - \|a\|\,a\psi_1 = -\|a\|\chi_1. \tag{A.46}$$

Hence either $a\psi_1 = \|a\|\psi_1$ or $a\chi_1 = -\|a\|\chi_1$, which proves the claim. □

We are now in a position to prove Theorem A.10.

Proof. By Lemma A.11, we already found one eigenvector $\upsilon_1$ of $a$, viz. either $\upsilon_1 = \psi_1$
or $\upsilon_1 = \chi_1$. Furthermore, it is easy to show that if a self-adjoint operator $a$ leaves
a linear subspace $L \subset H$ stable (in that $a\varphi \in L$ whenever $\varphi \in L$), then it also leaves
$L^\perp$ stable, and remains self-adjoint as an operator $a : L^\perp \to L^\perp$. First use this with
$L_1 = \mathbb{C} \cdot \upsilon_1$. Lemma A.11, now applied to $a : L_1^\perp \to L_1^\perp$, gives a second eigenvector
$\upsilon_2$. Now take $L_2$ to be the linear span of $\upsilon_1$ and $\upsilon_2$, and restrict $a$ to $L_2^\perp$, etc. Since
$H$ is finite-dimensional, this procedure ends after $\dim(H)$ steps.

This leaves us with a basis $(\upsilon_i)$ of $H$ that by construction entirely consists of
eigenvectors. The mutual orthogonality of these eigenvectors (and hence of the
spectral projections $e_\lambda$) follows from a simple calculation. □

Corollary A.12. The norm of a self-adjoint operator $a$ is given by

$$\|a\| = \sup\{|\lambda|,\ \lambda \in \sigma(a)\}. \tag{A.47}$$

Proof. This rapidly follows from Theorem A.10 by expanding $\psi$ in (A.18) with
respect to the basis of $H$ given in (A.39) - (A.40). □

Corollary A.13. A self-adjoint operator $a$ is a projection iff one of the following holds:

• $\sigma(a) = \{0\}$, in which case $a = 0$;

• $\sigma(a) = \{1\}$, in which case $a = 1$;

• $\sigma(a) = \{0,1\}$, in which case $a$ is called a proper projection.

In particular, if $e$ is a nonzero projection, then

$$\|e\| = 1. \tag{A.48}$$

Proof. Only the third case is nontrivial. If $a = e$ is a proper projection, then by
Corollary A.9 its eigenvectors can only lie in $L = eH$ (with eigenvalue $\lambda = 1$) or
in $L^\perp = (1 - e)H$ (with eigenvalue $\lambda = 0$). The converse implication follows from
Theorem A.10, notably from (A.37). Eq. (A.48) then follows from (A.47). □

A less elementary but more powerful approach to the spectral theorem is as
follows. For the notion of a C*-algebra see Definition C.1 in Appendix C.

Definition A.14. Let $a \in B(H)$. Then $C^*(a)$ is the C*-algebra generated by $a$ and
$1_H$ (i.e., the algebra of all polynomials in $a$).

Theorem A.15. If $a$ is self-adjoint, then $C^*(a)$ is commutative, and:

1. There is an isomorphism of (commutative) C*-algebras

$$C(\sigma(a)) \xrightarrow{\ \cong\ } C^*(a), \tag{A.49}$$

written $f \mapsto f(a)$, which is unique if it is subject to the following conditions:

• the unit function $1_{\sigma(a)} : \lambda \mapsto 1$ corresponds to the unit operator $1_H$;

• the identity function $\mathrm{id}_{\sigma(a)} : \lambda \mapsto \lambda$ is mapped to the given operator $a$.

2. In terms of the spectral projections $e_\lambda$ of the operator $a$ we have

$$C^*(a) = C^*(e_\lambda,\ \lambda \in \sigma(a)) = \mathrm{span}(e_\lambda,\ \lambda \in \sigma(a)), \tag{A.50}$$

where the middle term is the C*-algebra generated by the projections $e_\lambda$.

3. Under the isomorphism (A.49),

$$e_\lambda = \delta_\lambda(a), \tag{A.51}$$

where the delta-function $\delta_{\lambda'}$ on $\sigma(a)$ is defined by $\delta_{\lambda'} : \lambda \mapsto \delta_{\lambda\lambda'}$.
Proof. For any complex (finite) polynomial $p(x) = \sum_n c_n x^n$ on $\mathbb{R}$, define an operator

$$p(a) = \sum_n c_n a^n. \tag{A.52}$$

Simple computations then show that, for arbitrary polynomials $p$, $q$ and $t \in \mathbb{C}$,

$$(tp + q)(a) = t\,p(a) + q(a); \tag{A.53}$$

$$(pq)(a) = p(a)\,q(a); \tag{A.54}$$

$$p(a)^* = \overline{p}(a). \tag{A.55}$$

Hence the space $P^*(a)$ of all such polynomials in $a$ forms a *-subalgebra of $B(H)$. As
a linear subspace of the finite-dimensional vector space $B(H)$, $P^*(a)$ must itself be
finite-dimensional, hence it is a C*-algebra. Moreover, $P^*(a)$ clearly contains $1_H$ (take
$p(x) = 1$) as well as $a$ (take $p(x) = x$), and since $P^*(a) \subseteq C^*(a)$ by definition of the
latter, we must have $P^*(a) = C^*(a)$. Since $pq = qp$ and hence $p(a)q(a) = q(a)p(a)$
by the above computations, it follows that $P^*(a)$ and hence $C^*(a)$ is commutative.
This proves the first claim.

To establish the isomorphism (A.49), we are going to define a map

$$C(\sigma(a)) \ni f \mapsto f(a) \in C^*(a). \tag{A.56}$$

We initially do this for polynomials $f = p$, so that $f(a) = p(a)$ is defined by (A.52).
Since $C^*(a) = P^*(a)$ consists of polynomials in $a$, the map (A.56) is evidently
surjective. It is also injective, for suppose $p(a) = q(a)$. Applying this to an eigenvector
$\upsilon_\lambda \in H_\lambda$ yields $p(\lambda) = q(\lambda)$, for each $\lambda \in \sigma(a)$, and hence $p = q$ as functions
on $\sigma(a)$. Hence $f \mapsto f(a)$ is, at least, a bijection of sets. Moreover, the properties
(A.53) - (A.55) turn it into an isomorphism of C*-algebras, evidently with the
properties stated after (A.49). Finally, for any given function $f : \sigma(a) \to \mathbb{C}$ there exists
some polynomial $p$ that coincides with $f$ on the finite set $\sigma(a) \subset \mathbb{R}$, so that we may
define $f(a)$ in (A.56) by $p(a)$, as in (A.52); by the above proof of injectivity, the
ensuing operator $f(a)$ is independent of the choice of $p$.

We prove the last two claims, using the orthogonality property $e_\lambda e_\mu = \delta_{\lambda\mu} e_\lambda$ of
spectral projections and the defining properties $e_\lambda^2 = e_\lambda^* = e_\lambda$ of general projections,
see (A.26). From eq. (A.37) in Theorem A.10 we obtain (for polynomials $f$):

$$f(a) = \sum_{\lambda \in \sigma(a)} f(\lambda)\, e_\lambda. \tag{A.57}$$

If we now define $C^*(a)'$ as the linear span of the spectral projections $e_\lambda$ and $1_H$
(which is a unital commutative C*-algebra by the properties of the $e_\lambda$ just
mentioned), then (A.57) shows that $C^*(a) \subseteq C^*(a)'$. Conversely, (A.57) gives (A.51),
which shows that $C^*(a)' \subseteq C^*(a)$, and hence $C^*(a)' = C^*(a)$. □
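The functional calculus (A.56) - (A.57) and the identity $e_\lambda = \delta_\lambda(a)$ of (A.51) can be made concrete with Lagrange interpolation: on the finite set $\sigma(a)$, the delta-function $\delta_\lambda$ coincides with the polynomial $\prod_{\mu \neq \lambda} (x - \mu)/(\lambda - \mu)$. The sketch below applies this to a $3 \times 3$ real symmetric matrix whose spectrum $\{1, 3, 5\}$ is supplied as a known input rather than computed; the matrix and all names are illustrative assumptions.

```python
# Sketch: spectral projections as delta_lambda(a), and f(a) = sum_lambda f(lambda) e_lambda.

a = [[2, 1, 0], [1, 2, 0], [0, 0, 5]]
spectrum = [1, 3, 5]   # eigenvalues of a, given here as an assumption of the example

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def delta_of_a(lam):
    """Evaluate the Lagrange polynomial prod_{mu != lam} (x - mu)/(lam - mu) at x = a."""
    n = len(a)
    e = identity(n)
    for mu in spectrum:
        if mu != lam:
            factor = [[(a[i][j] - (mu if i == j else 0)) / (lam - mu)
                       for j in range(n)] for i in range(n)]
            e = matmul(e, factor)
    return e

projections = {lam: delta_of_a(lam) for lam in spectrum}

def functional_calculus(f):
    """f(a) as in (A.57): sum over the spectrum of f(lambda) times e_lambda."""
    n = len(a)
    return [[sum(f(lam) * e[i][j] for lam, e in projections.items())
             for j in range(n)] for i in range(n)]

def max_diff(x, y):
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))
```

With these definitions, `functional_calculus(lambda x: x)` reproduces `a` and `functional_calculus(lambda x: x * x)` reproduces `matmul(a, a)` up to rounding, in line with (A.53) - (A.54); non-polynomial functions such as a square root can be passed in the same way.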

A second approach to the final claims of Theorem A.15 is more ambitious, as it
includes a derivation of Theorem A.10 (instead of assuming it, as we just did). We
now use (A.51) to define the spectral projections $e_\lambda$; from (A.54) - (A.55) we have

$$e_\lambda^2 = \delta_\lambda(a)^2 = \delta_\lambda^2(a) = \delta_\lambda(a) = e_\lambda;$$

$$e_\lambda^* = \delta_\lambda(a)^* = \overline{\delta_\lambda}(a) = \delta_\lambda(a) = e_\lambda,$$

showing that $e_\lambda$ is indeed a projection. Also note the following identities in $C(\sigma(a))$:

$$\mathrm{id}_{\sigma(a)} = \sum_{\lambda \in \sigma(a)} \lambda \cdot \delta_\lambda; \tag{A.58}$$

$$1_{\sigma(a)} = \sum_{\lambda \in \sigma(a)} \delta_\lambda. \tag{A.59}$$

Transferring these from $C(\sigma(a))$ to $C^*(a)$ via the isomorphism (A.49) then yields
(A.37) - (A.38). To analyse the projections $e_\lambda$ defined by (A.51), we first compute

$$e_\lambda e_\mu = \delta_\lambda(a)\,\delta_\mu(a) = (\delta_\lambda\delta_\mu)(a) = \delta_{\lambda\mu}\,\delta_\lambda(a) = \delta_{\lambda\mu}\,e_\lambda, \tag{A.60}$$

which shows that the $e_\lambda$ are mutually orthogonal. Second, we compute

$$a e_\lambda\psi = a\,\delta_\lambda(a)\psi = \mathrm{id}_{\sigma(a)}(a)\,\delta_\lambda(a)\psi = (\mathrm{id}_{\sigma(a)} \cdot \delta_\lambda)(a)\psi = \lambda \cdot \delta_\lambda(a)\psi = \lambda e_\lambda\psi,$$

which shows that $e_\lambda H \subseteq H_\lambda$. Third, (A.60) and (A.59) give $\bigoplus_{\lambda \in \sigma(a)} e_\lambda H = H$, which
together with the second step gives $e_\lambda H = H_\lambda$. Hence the $e_\lambda$ are indeed the spectral
projections of $a$. Since we have already proved (A.37) - (A.38), we conclude that
Theorem A.10 follows from the first part of Theorem A.15. By the argument in the
main proof above, this first part then also yields the second part.

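The generalization of Theorem A.15 to a family $\underline{a} = (a_1,\ldots,a_n)$ of commuting
self-adjoint operators is as follows.

Definition A.16. Let $\underline{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators.

1. A joint eigenvector of $\underline{a}$ is a nonzero vector $\psi \in H$ such that $\underline{a}\psi = \underline{\lambda}\psi$, where
$\underline{\lambda} = (\lambda_1,\ldots,\lambda_n)$ with $\lambda_i \in \mathbb{C}$, i.e., for each $i = 1,\ldots,n$ one has $a_i\psi = \lambda_i\psi$. We
call $\underline{\lambda}$ a joint eigenvalue of $\underline{a}$.

2. The joint spectrum $\sigma(a_1,\ldots,a_n) \equiv \sigma(\underline{a})$ consists of all joint eigenvalues of $\underline{a}$.

3. $C^*(\underline{a})$ is the smallest unital C*-subalgebra of $B(H)$ that contains each $a_i$.

Clearly, we have

$$\sigma(\underline{a}) \subseteq \sigma(a_1) \times \cdots \times \sigma(a_n) \subset \mathbb{R}^n. \tag{A.61}$$

Furthermore, since $\dim(H) < \infty$, once again $C^*(\underline{a})$ is just the algebra of complex
polynomials in all operators $a_i$.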

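Theorem A.17. Let $\underline{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators on $H$.
Then the C*-algebra $C^*(\underline{a})$ generated by these operators is commutative, and:

1. There is a unique isomorphism of C*-algebras

$$C(\sigma(\underline{a})) \xrightarrow{\ \cong\ } C^*(\underline{a}), \tag{A.62}$$

written $f \mapsto f(\underline{a})$, subject to the following conditions:

• the unit function $1_{\sigma(\underline{a})} : \underline{\lambda} \mapsto 1$ corresponds to the unit operator $1_H$;

• the coordinate function $\pi_i : \underline{\lambda} \mapsto \lambda_i$ is mapped to $a_i$, for each $i = 1,\ldots,n$.

2. In terms of the spectral projections $e^{(i)}_{\lambda_i}$ of the operators $a_i$, we have

$$C^*(\underline{a}) = C^*\big(e^{(i)}_{\lambda_i},\ i = 1,\ldots,n,\ \lambda_i \in \sigma(a_i)\big). \tag{A.63}$$

3. If for each $\underline{\lambda} \in \sigma(\underline{a})$ we define the operator

$$e_{\underline{\lambda}} = e^{(1)}_{\lambda_1} \cdots e^{(n)}_{\lambda_n}, \tag{A.64}$$

then $e_{\underline{\lambda}}$ is a projection, in terms of which the joint spectrum may be rewritten as

$$\sigma(\underline{a}) = \{\underline{\lambda} \in \sigma(a_1) \times \cdots \times \sigma(a_n) \mid e_{\underline{\lambda}} \neq 0\}. \tag{A.65}$$

4. Finally, we have

$$C^*(\underline{a}) = C^*(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})) = \mathrm{span}(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})). \tag{A.66}$$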


We will not prove this in any detail, as the reasoning is quite analogous to the proof
of Theorem A.15; for example, in (A.56) one just has to replace $a$ by $\underline{a}$. The only
nontrivial point is that since all $a_i$ commute, so do all their spectral projections $e^{(i)}_{\lambda_i}$:
this follows from (A.51), which makes these operators elements of the commutative
C*-algebra $C^*(\underline{a})$ (which by definition contains each $C^*(a_i)$ and, in fact, is just
the smallest C*-algebra in $B(H)$ with this property). Using (A.38) for each $a_i$ and
multiplying the $n$ versions of the unit $1_H$ thus obtained with each other, yields

$$H = \bigoplus_{\underline{\lambda} \in \sigma(\underline{a})} H_{\underline{\lambda}}, \tag{A.67}$$

where $H_{\underline{\lambda}} = e_{\underline{\lambda}} H$. Since $p(\underline{a})\upsilon_{\underline{\lambda}} = p(\underline{\lambda})\upsilon_{\underline{\lambda}}$ for each joint eigenvector $\upsilon_{\underline{\lambda}} \in H_{\underline{\lambda}}$, eq. (A.67)
gives injectivity of the map (A.56) (mutatis mutandis) by the same argument as for $n = 1$.

This leads to a multi-spectral theorem for the commuting family $\underline{a}$, which is most
conveniently stated in the following form. First, for any polynomial

$$p(x_1,\ldots,x_n) = \sum_{k_1,\ldots,k_n} c_{k_1,\ldots,k_n}\, x_1^{k_1} \cdots x_n^{k_n} \tag{A.68}$$

in $n$ real variables, we generalize (A.52) to

$$p(\underline{a}) = \sum_{k_1,\ldots,k_n} c_{k_1,\ldots,k_n}\, a_1^{k_1} \cdots a_n^{k_n}. \tag{A.69}$$

Theorem A.18. Let $\underline{a} = (a_1,\ldots,a_n)$ be commuting self-adjoint operators on $H$.
Then for any polynomial $p$ in $n$ real variables, with associated operator (A.69),

$$p(\underline{a}) = \sum_{\underline{\lambda} \in \sigma(\underline{a})} p(\underline{\lambda})\, e_{\underline{\lambda}}, \tag{A.70}$$

where the spectral projections $e_{\underline{\lambda}}$ are given by (A.64).

The special case $p(x_1,\ldots,x_n) = 1$ then recovers (A.67). As for $n = 1$, eq. (A.70) may be
generalized to arbitrary continuous functions $f(x_1,\ldots,x_n)$, either by replacing $f$ by
a polynomial that coincides with $f$ on the joint spectrum $\sigma(\underline{a})$, or by approximating
$f$ by polynomials on some compact set $K$ containing $\sigma(\underline{a})$.
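Proposition A.19. Let $\underline{a} = (a_1,\ldots,a_n)$ be a family of commuting self-adjoint
operators on $H$. Then there is a self-adjoint operator $a \in B(H)$ such that $C^*(\underline{a}) = C^*(a)$.

Proof. Take $a = \sum_{\underline{\lambda} \in \sigma(\underline{a})} c_{\underline{\lambda}}\, e_{\underline{\lambda}}$ with all $c_{\underline{\lambda}} \in \mathbb{R}$ different from each other. Then

$$C^*(a) = C^*(e_{\underline{\lambda}},\ \underline{\lambda} \in \sigma(\underline{a})), \tag{A.71}$$

by (A.50), and hence the claim follows from (A.66). □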

Corollary A.20. Every (unital) commutative C*-algebra $C$ in $B(H)$ is generated by
a single self-adjoint operator $a$ (and the unit $1_H$), i.e., $C = C^*(a)$.

Proof. Just take a basis $(c_k)$ of $C$ as a vector space and decompose $c_k = a_k + i a'_k$
with $a_k$ and $a'_k$ self-adjoint (namely, $a_k = \frac{1}{2}(c_k + c_k^*)$ and $a'_k = -\frac{1}{2} i (c_k - c_k^*)$). If $C$ is
to be commutative, each $c_k$ must be normal, i.e., $c_k^* c_k = c_k c_k^*$, which is equivalent
to commutativity of $a_k$ and $a'_k$, and all $c_k$ must commute, i.e., all $a_k$ and $a'_k$ must
commute for different $k$. Hence $C = C^*(a_k, a'_k)$, which is of the form $C^*(\underline{a})$ for an
appropriate family $\underline{a}$, and so by Proposition A.19 it takes the form $C^*(a)$. □

We say that a unital commutative C*-algebra $C \subset B(H)$ is maximal if it is not
contained in some bigger unital commutative C*-algebra in $B(H)$. Also, we call a
self-adjoint operator $a$ maximal iff $\sigma(a)$ has cardinality $\dim(H)$, or, in other words,
if each eigenvalue of $a$ is nondegenerate. In finite dimension it is easy to classify
maximal unital commutative C*-algebras in $B(H)$ up to unitary equivalence.

Here we say (as usual) that a linear map $u : H \to H'$ is unitary when it is invertible
and satisfies $\langle u\varphi, u\psi \rangle' = \langle \varphi, \psi \rangle$ for each $\varphi, \psi \in H$ (note that the inverse is
automatically linear). Two *-algebras $C \subset B(H)$ and $C' \subset B(H')$ are called unitarily
equivalent, then, if there is a unitary map $u : H \to H'$ such that $C' = u C u^{-1}$.

Theorem A.21. A unital commutative C*-algebra $C \subset B(H)$ is maximal iff it is unitarily
equivalent to the algebra $D_n(\mathbb{C})$ of all diagonal matrices on $H' = \mathbb{C}^n$.

Proof. First, $D_n(\mathbb{C})$ is indeed maximal abelian in $M_n(\mathbb{C})$; any extension of $D_n(\mathbb{C})$
would have to contain some additional matrix $b \in M_n(\mathbb{C})$ that commutes with all
$a \in D_n(\mathbb{C})$, but by elementary linear algebra this very property implies $b \in D_n(\mathbb{C})$.

By Corollary A.20, we have $C = C^*(a)$, where $a^* = a$. Then $C$ is maximal iff $a$
is maximal. For if not, some eigenvalue $\lambda' \in \sigma(a)$ would have multiplicity $m_{\lambda'} > 1$,
and hence the corresponding spectral projection $e_{\lambda'}$ could be decomposed as
$e_{\lambda'} = e_{\lambda'}^{(1)} + e_{\lambda'}^{(2)}$, where both terms are orthogonal and hence commute. We could then
extend $C^*(a)$, as in (A.50), to $C^*(e_\lambda, e_{\lambda'}^{(1)}, e_{\lambda'}^{(2)}, \lambda \in \sigma(a), \lambda \neq \lambda')$, which remains
commutative, and we have a contradiction with the alleged maximality of $C^*(a)$.

Thus $a$ is maximal, in which case we list the spectrum as $\sigma(a) = \{\lambda_1, \ldots, \lambda_n\}$,
with corresponding eigenvectors $\{v_{\lambda_1}, \ldots, v_{\lambda_n}\}$. This gives rise to a unitary map
$u : H \to \mathbb{C}^n$ defined by $u v_{\lambda_i} = \upsilon_i$, where $(\upsilon_1, \ldots, \upsilon_n)$ is the standard basis of $\mathbb{C}^n$, and
clearly $u a u^{-1} = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. If (as is the case) all entries $\lambda_i \in \mathbb{R}$ are different,
any $(z_1, \ldots, z_n) \in \mathbb{C}^n$ may be written as $z_i = p(\lambda_i)$, $i = 1, \ldots, n$, where $p$ is some
complex polynomial $p(x) = \sum_k c_k x^k$, $x \in \mathbb{R}$, $c_k \in \mathbb{C}$. Hence $u C^*(a) u^{-1} = D_n(\mathbb{C})$. □
A.5 Positive operators and the trace

Operators $a : H \to H$ satisfying one (and hence all) of the conditions in the next
proposition are called positive, written $a \geq 0$ or $0 \leq a$. More generally, we write $a \leq b$
iff $b - a \geq 0$. Positive operators play a very important role in quantum mechanics.

Proposition A.22. The following conditions on an operator $a$ are equivalent:

1. $\langle \psi, a\psi \rangle \geq 0$ for arbitrary $\psi \in H$.

2. $a^* = a$ and $\sigma(a) \subset \mathbb{R}^+$.

3. $a = c^2$ for some self-adjoint operator $c$.

4. $a = b^* b$ for some operator $b$.

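Proof. 1 → 2: Putting $\langle \psi, a\psi \rangle \geq 0$ in (A.44) gives $\langle \psi, a\psi \rangle = \langle \psi, a^*\psi \rangle$ for all $\psi$.
But for any operator $b$ and vectors $\chi, \varphi \in H$, as in (A.10) we have the identity

    $4\langle \chi, b\varphi \rangle = \langle \chi + \varphi, b(\chi + \varphi) \rangle - \langle \chi - \varphi, b(\chi - \varphi) \rangle$
        $+ \, i\langle \chi - i\varphi, b(\chi - i\varphi) \rangle - i\langle \chi + i\varphi, b(\chi + i\varphi) \rangle$.    (A.72)

So $b = 0$ iff $\langle \psi, b\psi \rangle = 0$ for all $\psi \in H$, and hence condition 1 implies $a^* = a$. We
therefore know that $\sigma(a) \subset \mathbb{R}$, and since an eigenvalue $\lambda < 0$ would contradict
condition 1, the second condition follows.

2 → 3: define $c = \sqrt{a}$, where (since $\lambda_i \geq 0$) the square root is (well) defined by

    $\sqrt{a} = \sum_{i=1}^{\dim(H)} \sqrt{\lambda_i}\, |\upsilon_i\rangle\langle\upsilon_i|$.    (A.73)

3 → 4 is trivial (take $b = c$), as is 4 → 1, since $\langle \psi, a\psi \rangle = \|b\psi\|^2 \geq 0$. □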
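The equivalences just proved can be checked numerically on a small example. The sketch below is an illustration only: the 2x2 matrix `b`, the test vector `psi`, and the helper names are assumptions, and the inner product is taken linear in its second entry. It builds a = b*b, confirms that a is self-adjoint with non-negative eigenvalues, and verifies that the expectation value of a in psi equals the squared norm of b psi.

```python
import math

def dagger(m):
    # Conjugate transpose of a square matrix given as nested lists.
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(a, psi):
    return [sum(a[i][k] * psi[k] for k in range(len(psi))) for i in range(len(a))]

def inner(phi, psi):
    # Inner product, conjugate-linear in the first entry.
    return sum(x.conjugate() * y for x, y in zip(phi, psi))

b = [[1 + 0j, 2 + 0j], [0j, 1j]]   # arbitrary operator (hypothetical data)
a = matmul(dagger(b), b)           # condition 4: a = b* b

# Condition 2: a* = a and both eigenvalues are >= 0 (2x2 case via trace and determinant).
is_self_adjoint = (a == dagger(a))
tr = (a[0][0] + a[1][1]).real
det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
disc = math.sqrt(tr * tr - 4.0 * det)
eigenvalues = ((tr - disc) / 2.0, (tr + disc) / 2.0)

# Condition 1: the expectation value equals ||b psi||^2, hence is >= 0.
psi = [1 - 1j, 0.5j]
expectation = inner(psi, apply(a, psi))
b_psi = apply(b, psi)
norm_sq = inner(b_psi, b_psi)
```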

Combining this with Proposition A.5, we obtain the following result. 

Proposition A.23. The relationship $\langle \varphi, \psi \rangle' = \langle \varphi, a\psi \rangle$ gives a bijective correspondence
between (hermitian/positive) sesquilinear forms $\langle \cdot, \cdot \rangle'$ on $H$ and (hermitian/positive)
operators $a$ on $H$.

Proof. One direction is trivial. For the other, fix $\chi \in H$ and define a functional
$f(\psi) = \langle \chi, \psi \rangle'$. By Proposition A.5, $f = f_\varphi$ for some unique $\varphi \in H$. Define an
operator $b : H \to H$ by $b\chi = \varphi$ and put $a = b^*$. □

Proposition A.24. Any self-adjoint operator $a \in B(H)$ has a decomposition

    $a = a_+ - a_-$,    (A.74)

where $a_\pm \geq 0$. These are unique if they also satisfy $a_+ a_- = a_- a_+ = 0$.

Proof. Using Theorem A.10, we may take

    $a_\pm = \pm \sum_{\lambda \in \sigma(a) \cap \mathbb{R}^\pm} \lambda\, e_\lambda$.    (A.75)
Equivalently, we may use Theorem A.15 to rewrite (A.75) as

    $a_\pm = (|\mathrm{id}_{\sigma(a)}| \cdot 1_{\mathbb{R}^\pm})(a) \equiv f_\pm(a)$,    (A.76)

where $|\mathrm{id}_{\sigma(a)}|$ is the function $\lambda \mapsto |\lambda|$, $\mathbb{R}^+ = [0, \infty)$ and $\mathbb{R}^- = (-\infty, 0)$. To prove
uniqueness, we note that since $\sigma(a) \subset \mathbb{R}$ is finite, there is a polynomial $p$ such that
$f_+ = p$, and hence $a_+ = p(a)$. If $a = a'_+ - a'_-$ with $a'_\pm \geq 0$ and $a'_+ a'_- = a'_- a'_+ = 0$,
then for any polynomial $p$ we have $p(a) = p(a'_+) + p(-a'_-)$. For the one just taken,
this gives $p(a) = a'_+$ by positivity of the $a'_\pm$, and hence $a'_+ = a_+$, etc. □

We now introduce a construction of great significance to quantum mechanics. 

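Lemma A.25. If $(\upsilon_i)$ and $(\upsilon'_i)$ are bases of $H$, then for any operator $a : H \to H$,

    $\sum_i \langle \upsilon_i, a\upsilon_i \rangle = \sum_i \langle \upsilon'_i, a\upsilon'_i \rangle$.

Proof. A simple computational proof uses the identity (A.40) for any basis $(\upsilon_i)$ (i.e.,
the $\upsilon_i$ need not be eigenvectors of $a$, as in (A.39)). Then, as in physics books,

    $\sum_i \langle \upsilon'_i, a\upsilon'_i \rangle = \sum_{i,j,k} \langle \upsilon_k, \upsilon'_i \rangle \langle \upsilon'_i, \upsilon_j \rangle \langle \upsilon_j, a\upsilon_k \rangle = \sum_{j,k} \langle \upsilon_k, \upsilon_j \rangle \langle \upsilon_j, a\upsilon_k \rangle = \sum_i \langle \upsilon_i, a\upsilon_i \rangle$. □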

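This lemma allows us to define the trace of $a$ by

    $\mathrm{Tr}(a) = \sum_i \langle \upsilon_i, a\upsilon_i \rangle$,    (A.77)

where $(\upsilon_i)$ is any basis of $H$. By almost the same proof as Lemma A.25 we obtain

    $\mathrm{Tr}(ab) = \sum_{i,j} \langle \upsilon_i, a\upsilon_j \rangle \langle \upsilon_j, b\upsilon_i \rangle = \sum_{i,j} \langle \upsilon_i, b\upsilon_j \rangle \langle \upsilon_j, a\upsilon_i \rangle = \mathrm{Tr}(ba)$.    (A.78)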

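If $u$ is unitary (in that $uu^* = u^*u = 1_H$), then from either Lemma A.25 or eq. (A.78),

    $\mathrm{Tr}(u a u^*) = \mathrm{Tr}(a)$.    (A.79)

Finally, if $a^* = a$, then (A.37) and taking the trace over the basis in (A.39) yields

    $\mathrm{Tr}(a) = \sum_{\lambda \in \sigma(a)} m_\lambda \lambda$,    (A.80)

where $m_\lambda$ is the multiplicity of the eigenvalue $\lambda$.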
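The cyclicity of the trace and its invariance under unitary conjugation are easy to confirm numerically. The following sketch uses hypothetical 2x2 matrices and a rotation as the unitary; none of these data come from the text.

```python
import math

def dagger(m):
    # Conjugate transpose; complex() makes this work for int and float entries too.
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    # Sum of diagonal matrix elements in the standard basis.
    return sum(m[i][i] for i in range(len(m)))

a = [[1, 2 - 1j], [2 + 1j, -3]]   # self-adjoint operator (hypothetical data)
b = [[0, 1j], [4, 2]]             # arbitrary operator (hypothetical data)
t = 0.7
u = [[math.cos(t), -math.sin(t)],
     [math.sin(t), math.cos(t)]]  # a rotation, which is unitary

# Tr(ab) = Tr(ba)
cyclic_gap = abs(trace(matmul(a, b)) - trace(matmul(b, a)))
# Tr(u a u*) = Tr(a)
unitary_gap = abs(trace(matmul(matmul(u, a), dagger(u))) - trace(a))
```

Both gaps vanish up to floating-point rounding, which is why they are compared against a small tolerance rather than against zero.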

Definition A.26. A density operator is a positive operator $\rho$ on $H$ such that

    $\mathrm{Tr}(\rho) = 1$.    (A.81)

The analysis of density operators hinges on the introduction of a second operator
norm, beside the canonical one (A.18). In finite dimension these norms are equivalent,
but in general they are not, and it makes sense to introduce both already here.
For any $a \in B(H)$, the operator $a^*a$ is positive and hence self-adjoint, so that

    $a^*a = \sum_{\mu \in \sigma(a^*a)} \mu\, e_\mu = \sum_{i=1}^{n} \mu_i\, |\upsilon_i\rangle\langle\upsilon_i|$    (A.82)

for certain eigenvalues $\mu_i \geq 0$ (including possible multiplicities) or $\mu \in \sigma(a^*a)$
(excluding multiplicities), all necessarily non-negative by positivity of $a^*a$, and some
normalized eigenvectors $\upsilon_i$ or spectral projections $e_\mu$, cf. (A.37) - (A.39). Then put

    $\|a\|_1 = \sum_{\mu \in \sigma(a^*a)} m_\mu \sqrt{\mu} = \sum_{i=1}^{n} \sqrt{\mu_i}$,    (A.83)

where $m_\mu$ is the multiplicity of $\mu$.

It is not immediately clear that $\|\cdot\|_1$ is a norm on $B(H)$, but we will shortly prove
that it is; we provisionally refer to $B(H)$, equipped with the norm (A.83), as $B_1(H)$.
Another way to define this trace-norm is to first introduce the absolute value

    $|a| = \sqrt{a^*a}$    (A.84)

of any operator $a \in B(H)$, where the square root is simply defined as

    $\sqrt{a^*a} = \sum_{\mu \in \sigma(a^*a)} \sqrt{\mu}\, e_\mu = \sum_{i=1}^{n} \sqrt{\mu_i}\, |\upsilon_i\rangle\langle\upsilon_i|$,    (A.85)

which coincides with $f(a^*a)$ for $f(x) = \sqrt{x}$ as defined in Theorem A.15, see (A.57).
If $a$ is positive, then $|a| = a$. Some other useful properties of the absolute value are

    $\ker|a| = \ker a = (\mathrm{ran}\,|a|)^\perp$;    (A.86)

    $\| |a|\psi \| = \|a\psi\|$,  $\psi \in H$.    (A.87)

For the first equality in (A.86),

    $a\psi = 0 \Rightarrow a^*a\psi = 0 \Leftrightarrow \sqrt{a^*a}\,\psi = 0 \Leftrightarrow |a|\psi = 0$,

but also $a^*a\psi = 0 \Rightarrow \langle \psi, a^*a\psi \rangle = 0 \Leftrightarrow \|a\psi\| = 0 \Rightarrow a\psi = 0$. For the second,

    $\ker a = (\mathrm{ran}\,a^*)^\perp$,    (A.88)

which in turn is immediate from the definition of the adjoint. Eq. (A.87) is similar.
Though once again lacking transparency as a norm, by construction we now have

    $\|a\|_1 = \mathrm{Tr}(|a|)$,    (A.89)

so if $(\lambda_i)$ are the (positive) eigenvalues of $|a|$, including multiplicities, then

    $\|a\|_1 = \sum_{i=1}^{n} \lambda_i$.    (A.90)
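For a 2x2 matrix the trace-norm can be computed by hand from the two eigenvalues of a*a, which are determined by Tr(a*a) and det(a*a) = |det a|^2. The sketch below does exactly this; the function name `trace_norm_2x2` and the test matrices are assumptions for illustration, and the closed form is specific to dimension two.

```python
import math

def trace_norm_2x2(a):
    """Trace-norm of a 2x2 matrix: sum of the square roots of the eigenvalues of a* a."""
    (p, q), (r, s) = a
    t = abs(p) ** 2 + abs(q) ** 2 + abs(r) ** 2 + abs(s) ** 2   # Tr(a* a)
    d = abs(p * s - q * r) ** 2                                 # det(a* a) = |det a|^2
    disc = math.sqrt(max(t * t - 4.0 * d, 0.0))
    mu_plus = (t + disc) / 2.0
    mu_minus = max((t - disc) / 2.0, 0.0)   # guard against tiny negative rounding errors
    return math.sqrt(mu_plus) + math.sqrt(mu_minus)

norm_a = trace_norm_2x2([[1, 2], [3, 4]])
norm_nilpotent = trace_norm_2x2([[0, 1], [0, 0]])

# Triangle inequality on one example: two rank-one projections adding up to the unit.
e1 = [[1, 0], [0, 0]]
e2 = [[0, 0], [0, 1]]
unit = [[1, 0], [0, 1]]
triangle_ok = trace_norm_2x2(unit) <= trace_norm_2x2(e1) + trace_norm_2x2(e2) + 1e-12
```

For the first matrix the two singular values add up to sqrt(Tr(a*a) + 2|det a|) = sqrt(34), which gives an independent check of the result.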

To obtain suitable estimates for the trace norm we need some further techniques. 
Definition A.27. Let $H$ be a finite-dimensional Hilbert space.

1. A partial isometry is an operator $u \in B(H)$ for which $u^*u = e$ is a projection.

2. A unitary is an invertible partial isometry.

For immediate and later reference, we collect some properties of such operators.
Lemma A.28. Let $H$ be a Hilbert space with a partial isometry $u \in B(H)$.

• Also $u^*$ is a partial isometry, or, equivalently, $uu^* = f$ is a projection.

• The kernel of $u$ is $(eH)^\perp$, and its range is $fH$.

• The given partial isometry $u$ is unitary from $eH$ to $fH$.

• Conversely, an operator $v$ on $H$ for which there is a (closed) subspace $L \subset H$ on
which $v$ is isometric, whilst it is identically zero on $L^\perp$, is a partial isometry.

• If $u \neq 0$, then $\|u\| = 1$.

• A partial isometry $u$ is unitary iff $u^*u = uu^* = 1_H$ (i.e., $e = f = 1_H$).

The proof is an easy verification. In the infinite-dimensional case, a distinction arises
between isometries (i.e., injective partial isometries, so that $u^*u = 1_H$) and unitaries,
but if $\dim(H) < \infty$, injectivity implies surjectivity and hence bijectivity.

We now come to von Neumann's highly convenient polar decomposition of an
operator, which mimics the polar decomposition $z = r\exp(i\varphi)$ of $z \in \mathbb{C}$.

Proposition A.29. For $a \in B(H)$, assumed nonzero, the operator $u$ given by

    $u|a|\psi = a\psi$,  ($|a|\psi \in \mathrm{ran}\,|a|$);    (A.91)

    $u\psi = 0$,  ($\psi \in (\mathrm{ran}\,|a|)^\perp = \ker|a|$):    (A.92)


1. Is well defined;

2. Is a partial isometry (and hence has norm $\|u\| = 1$);

3. Is unitary from $\mathrm{ran}\,|a|$ to $\mathrm{ran}\,a$ (if $\dim(H) = \infty$, take closures $(\mathrm{ran}\,|a|)^-$, $(\mathrm{ran}\,a)^-$);

4. Satisfies 


= llarll; (A.93) 

    $u^*u|a| = |a| = |a|u^*u$.    (A.94)

Given that $u$ is a partial isometry, it is characterized by the two properties:

    $\ker u = \ker a$;    (A.95)

    $a = u|a|$.    (A.96)

Furthermore, if $a \neq 0$, then $a$ is invertible iff $u$ is unitary.

Proof. This follows from (A.86) - (A.87), except the claim that (A.95) - (A.96)
uniquely define $u$, which we will not use and whose proof we therefore omit. □
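A very small worked instance of the polar decomposition is the nilpotent 2x2 matrix below, for which |a| and u can be written down directly. The matrices are hypothetical example data and the helper names are assumptions; the sketch only checks the defining relations a = u|a|, |a|^2 = a*a, and that u*u and uu* are projections different from the unit, as expected for a non-invertible a.

```python
def dagger(m):
    # Conjugate transpose of a square matrix given as nested lists.
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

a = [[0, 1], [0, 0]]       # nilpotent, hence not invertible
abs_a = [[0, 0], [0, 1]]   # candidate for |a| = sqrt(a* a)
u = [[0, 1], [0, 0]]       # candidate partial isometry (here it coincides with a)

abs_a_is_root = matmul(abs_a, abs_a) == matmul(dagger(a), a)
polar_holds = matmul(u, abs_a) == a

e = matmul(dagger(u), u)   # initial projection u* u
f = matmul(u, dagger(u))   # final projection u u*
e_is_projection = (matmul(e, e) == e) and (dagger(e) == e)
f_is_projection = (matmul(f, f) == f) and (dagger(f) == f)
u_is_unitary = (e == [[1, 0], [0, 1]]) and (f == [[1, 0], [0, 1]])
```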

Recall from the easy Theorem 2.7 that there is a bijective correspondence between
linear maps $\omega : B(H) \to \mathbb{C}$ and operators $\rho \in B_1(H)$, given by (2.33), i.e.,

    $\omega(a) = \mathrm{Tr}(\rho a)$.    (A.97)
Proposition A.30. If $H$ is finite-dimensional, the map $\omega \mapsto \rho$ from $B(H)^*$ to $B_1(H)$,
defined by (A.97), gives an isometric isomorphism of Banach spaces

    $B(H)^* \cong B_1(H)$;    (A.98)

in particular, one has

    $\|\omega\| = \|\rho\|_1$.    (A.99)

Proof. Bijectivity being known already, the basic estimate towards (A.99) is

    $|\mathrm{Tr}(\rho a)| \leq \|\rho\|_1 \|a\|$.    (A.100)

This follows from the polar decomposition $\rho = u|\rho|$ and the spectral decomposition

    $|\rho| = \sum_{i=1}^{m \leq n} p_i\, |\upsilon_i\rangle\langle\upsilon_i|$,    (A.101)

where $p_i > 0$ (but not necessarily $\sum_i p_i = 1$). Assuming $\rho \neq 0$, using (A.101), (A.78),
Cauchy-Schwarz, (A.20), (A.21), $\|u\| = \|\upsilon_i\| = 1$, and (A.90), we indeed have

    $|\mathrm{Tr}(\rho a)| = |\mathrm{Tr}(u|\rho|a)| = |\mathrm{Tr}(|\rho| a u)| = |\sum_i p_i \langle \upsilon_i, a u \upsilon_i \rangle|$    (A.102)

    $\leq \sum_i p_i |\langle \upsilon_i, a u \upsilon_i \rangle| \leq \sum_i p_i \|a\| \|u\| \|\upsilon_i\|^2 = \|\rho\|_1 \|a\|$.    (A.103)

To prove saturation of this bound, take $a = u^*$, which is isometric on the space
$\mathrm{ran}\,|\rho| = \mathrm{span}(\upsilon_1, \ldots, \upsilon_m)$ and hence satisfies $\|a\| = 1$ as well as $\langle \upsilon_i, a u \upsilon_i \rangle = 1$.
Consequently, from (A.102) we find $|\mathrm{Tr}(\rho a)| = \sum_i p_i$. By (A.90) for $\rho$ instead of $a$, i.e.,
$\|\rho\|_1 = \mathrm{Tr}(|\rho|) = \sum_i p_i$, we obtain $|\mathrm{Tr}(\rho a)| = \|\rho\|_1$, which yields (A.99). □

Corollary A.31. The trace-norm $\|\cdot\|_1$ is (indeed) a norm on $B_1(H)$.

As explained in more detail in §B.9, for any vector space $V$ with norm, with double
dual $V^{**}$, we have a canonical map $V \to V^{**}$ given by $v \mapsto \hat{v}$, where

    $\hat{v}(\theta) = \theta(v)$,    (A.104)

where $v \in V$, $\hat{v} \in V^{**}$, and $\theta \in V^*$. By the general theory, this map is always isometric
(and hence injective), and if $V$ is finite-dimensional, it is also surjective and hence
an isomorphism. Therefore, taking $V = B(H)$, we infer from (A.98) that

    $B_1(H)^* \cong B(H)$,    (A.105)

where $a \in B(H)$ corresponds to $\hat{a} \in B_1(H)^*$ by means of

    $\hat{a}(\rho) = \mathrm{Tr}(\rho a)$.    (A.106)

This new role of $B(H)$ as the dual of $B_1(H)$ also equips it with a new topology
(besides the norm topology it already has), viz. the accompanying w*-topology.
This topology is defined by saying that $a_n \to a$ iff $\hat{a}_n(\rho) \to \hat{a}(\rho)$ for each
$\rho \in B_1(H)$. For historical reasons this is called the σ-weak topology on $B(H)$, so we say
that $a_n \to a$ σ-weakly in $B(H)$ iff $\mathrm{Tr}(\rho a_n) \to \mathrm{Tr}(\rho a)$ for each $\rho \in B_1(H)$.

To close, it is interesting to put the trace-norm into a classical perspective. As
explained in Chapter 1, at least on finite-dimensional Hilbert spaces, density operators
are the quantum counterparts of probability measures (or distributions). If $X$ is a
finite set, the associated function space $C(X)$ carries the supremum-norm

    $\|f\|_\infty = \sup\{|f(x)|, x \in X\}$,    (A.107)

cf. (1.24). We equip the space $C(X)^*$ of all linear maps $\omega : C(X) \to \mathbb{C}$ with the norm

    $\|\omega\| = \sup\{|\omega(f)|, f \in C(X), \|f\|_\infty = 1\}$.    (A.108)

Let $L^1(X)$ be the vector space of all functions $\rho : X \to \mathbb{C}$, equipped with the norm

    $\|\rho\|_1 = \sum_{x \in X} |\rho(x)|$.    (A.109)

As in the quantum case just discussed, even for finite $X$ it is not immediate that this
expression indeed defines a norm; this follows from the next proposition.

Each $\rho \in L^1(X)$ defines a linear map $\omega : C(X) \to \mathbb{C}$ by

    $\omega(f) = \sum_{x \in X} \rho(x) f(x)$.    (A.110)

Conversely, each $\omega \in C(X)^*$ defines an element $\rho \in L^1(X)$ by

    $\rho(x) = \omega(\delta_x)$,    (A.111)

with $\delta_x \in C(X)$ defined by $\delta_x(y) = \delta_{xy}$ as usual.

Proposition A.32. If $X$ is finite, the map $\omega \mapsto \rho$ from $C(X)^*$ to $L^1(X)$, defined by
(A.111), has inverse (A.110) and gives an isometric isomorphism

    $C(X)^* \cong L^1(X)$    (A.112)

of Banach spaces; in particular, one has

    $\|\omega\| = \|\rho\|_1$.    (A.113)

Proof. The vector space isomorphism in question can be checked effortlessly. To
verify (A.113), note that trivially $|\omega(f)| \leq \|\rho\|_1 \|f\|_\infty$, whence $\|\omega\| \leq \|\rho\|_1$. To
show saturation of this bound, given $\rho \in L^1(X)$ take $f(x) = |\rho(x)|/\rho(x)$ if $\rho(x) \neq 0$
and $f(x) = 0$ elsewhere; if $\rho \neq 0$ this gives $\|f\|_\infty = 1$ and $|\omega(f)| = \|\rho\|_1$. □
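The saturation argument in this proof can be replayed on a four-point set. In the sketch below the set, the complex weights in `rho`, and the variable names are hypothetical; the code builds the phase function f(x) = |rho(x)|/rho(x) used in the proof and compares |omega(f)| with the 1-norm of rho.

```python
# A complex-valued function rho on a finite set X (hypothetical data).
rho = {"x1": 0.5, "x2": -1.5, "x3": 2j, "x4": 0.0}

def omega(f):
    # The functional associated with rho: omega(f) = sum_x rho(x) f(x).
    return sum(rho[x] * f[x] for x in rho)

norm_rho_1 = sum(abs(v) for v in rho.values())

# The saturating function from the proof: a pure phase where rho is nonzero, 0 elsewhere.
f_sat = {x: (abs(v) / v if v != 0 else 0.0) for x, v in rho.items()}
sup_norm_f = max(abs(v) for v in f_sat.values())

saturated_value = abs(omega(f_sat))

# Any other f with sup-norm 1 gives at most norm_rho_1, e.g. the constant function 1.
constant_one = {x: 1.0 for x in rho}
other_value = abs(omega(constant_one))
```

The constant function only reaches |0.5 - 1.5 + 2i|, which is strictly below the 1-norm, whereas the phase-matched function attains it.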
Notes

The material in this appendix has been collected from numerous functional analysis 
books (some of which are mentioned in the Notes to the next appendix), adapted to 
the finite-dimensional case. Though not used in preparing this text, Halmos (1958, 
1970) are classics. Theorem A.3 is due to Jordan & von Neumann (1935); Amir 
(1986) contains many other characterizations of inner product spaces. 





Appendix B 

Basic functional analysis 


This appendix contains all technical information on general Hilbert spaces (as opposed
to the finite-dimensional ones of the previous appendix) and, more generally,
infinite-dimensional Banach spaces, that is either directly needed in the main text, 
or forms necessary preparation for the next appendix on operator algebras (which in 
turn play a central role in this book). Since most interesting examples of both Hilbert 
spaces and more general Banach spaces require some measure theory, which at the 
same time provides the mathematical foundation of probability theory, we include 
a brief introductory overview to this area as well (restricted, though, to the case we 
need, viz. measures and integrals on locally compact spaces). 

Functional analysis has its roots in both mathematics and physics. In particular, 
the general area of spectral theory, which emerged during the period 1900-1930 
in the hands of Hilbert and his school, largely owes its existence to mathematical 
physics, as well as to Hilbert’s genius in finding the right combination of examples 
and abstract theory (including his innovative definition of the spectrum). Hilbert’s 
school culminated in the books Methoden der mathematischen Physik by Courant 
and Hilbert (1924), Gruppentheorie und Quantenmechanik by Weyl (1928), and 
Mathematische Grundlagen der Quantenmechanik by von Neumann (1932), all of 
whom were at Göttingen at the time (as were such giants in the history of quantum
mechanics as Born, Heisenberg, and Jordan). Whereas Courant & Hilbert
at least thought they described classical physics (although it soon turned out that
their discussion of eigenvalue problems paved the way for the Schrödinger equation
discovered two years later), von Neumann explicitly developed the Hilbert space
formalism in order to describe quantum physics (for example, the modern abstract
definition of a Hilbert space was his), as did Weyl (in connection with group theory).

What seems to have come from pure mathematics, though, is the idea, central to 
functional analysis, of looking at functions as points in some (infinite-dimensional) 
vector space. This emerged from the French school of Hadamard and his student 
Fréchet, requiring considerable interaction between the (then) new fields of linear
algebra and topology. Eventually, this also led to the fundamental work of Banach. 

We hope that the combination of logical setup, examples, theorems, and proofs in 
this appendix helps convince the reader of the sober elegance of functional analysis. 
B.1 Completeness

A notable difference between finite-dimensional vector spaces with norm and
infinite-dimensional ones is that the former are always complete in a sense to be
defined now, whereas the latter may or may not be. This distinction has major
consequences, especially where idealizations (and hence limits) are concerned.

As before, all vector spaces are defined over $\mathbb{C}$ (unless stated otherwise).

Definition B.1. Let $V$ be a vector space (or, more generally, a set).

A metric on $V$ is a function $d : V \times V \to \mathbb{R}^+$ satisfying, for all $f, g, h \in V$:

1. $d(f, g) \leq d(f, h) + d(h, g)$ (triangle inequality);

2. $d(f, g) = d(g, f)$ for all $f, g \in V$ (symmetry);

3. $d(f, g) = 0$ iff $f = g$ (positive definiteness).

Our main example is a vector space $V$ with norm $\|\cdot\|$, which, as an easy exercise
shows, gives rise to a metric on $V$ via

    $d(f, g) = \|f - g\|$.    (B.1)

In particular, an inner product on $V$ induces a metric on $V$ through (A.2) and (B.1).

The reader should have some experience with metric spaces from an undergraduate
Analysis course, but for convenience we repeat the definition of completeness.

Definition B.2. 1. Let $(v_n) = \{v_n\}_{n \in \mathbb{N}}$ be a sequence in a metric space $(V, d)$.
We say that $v_n \to v$ for some $v \in V$ when $\lim_{n \to \infty} d(v_n, v) = 0$, or, more precisely:
for any $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $d(v_n, v) < \varepsilon$ for all $n > N$. In a normed
space, this means that $v_n \to v$ iff $\lim_{n \to \infty} \|v_n - v\| = 0$.

2. A sequence $(v_n)$ in $(V, d)$ is called a Cauchy sequence when $d(v_n, v_m) \to 0$ when
$n, m \to \infty$, or, more precisely: for any $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that $d(v_n, v_m) < \varepsilon$
for all $n, m > N$. In a normed space, this means that $(v_n)$ is Cauchy when
$\|v_n - v_m\| \to 0$ for $n, m \to \infty$, in other words, when $\lim_{n, m \to \infty} \|v_n - v_m\| = 0$.

3. A metric space $(V, d)$ is called complete when every Cauchy sequence in $V$ converges
(i.e., to an element of $V$).

A convergent sequence is Cauchy: from the triangle inequality and symmetry one
has $d(v_n, v_m) \leq d(v_n, v) + d(v_m, v)$, so for given $\varepsilon > 0$ there is $N \in \mathbb{N}$ such that
$d(v_n, v) < \varepsilon/2$ for all $n > N$, et cetera. However, the converse statement does not hold in general:
for example, take the vector space $\ell_c(\mathbb{N})$ of all functions $f : \mathbb{N} \to \mathbb{C}$ that are zero
except at finitely many places (with the obvious pointwise operations), or, equivalently,
the vector space $\mathbb{C}^\infty$ of all sequences $(x_n)$ with finitely many nonzero entries.
This vector space is incomplete in any conceivable norm, like the sup-norm

    $\|f\|_\infty = \sup\{|f(x)|, x \in \mathbb{N}\}$.    (B.2)

Indeed, the sequence $(f_n)$, where $f_n(x) = 1/x$ for $x = 1, \ldots, n$ and $f_n(x) = 0$ for $x > n$,
which corresponds to the sequence $(1, 1/2, 1/3, \ldots, 1/n, 0, 0, \ldots)$ in $\mathbb{C}^\infty$, is Cauchy,
but its obvious limit $f(x) = 1/x$ for each $x \in \mathbb{N}$, or $x_n = 1/n$, does not lie in $\ell_c(\mathbb{N})$.
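The Cauchy property of this truncated sequence can be made explicit with exact rational arithmetic. The sketch below uses hypothetical helper names `f` and `sup_dist`; it evaluates the sup-norm distance between two truncations, which only needs the finitely many points where at least one of them is nonzero.

```python
from fractions import Fraction

def f(n, x):
    # The truncated function f_n: 1/x for x <= n and 0 beyond.
    return Fraction(1, x) if x <= n else Fraction(0)

def sup_dist(n, m):
    # Sup-norm distance between f_n and f_m; both vanish beyond max(n, m).
    hi = max(n, m)
    return max(abs(f(n, x) - f(m, x)) for x in range(1, hi + 1))

d_small = sup_dist(10, 1000)    # equals 1/11
d_large = sup_dist(500, 1000)   # equals 1/501

# Number of nonzero entries of f_n grows without bound, so no element of the
# space of finitely supported functions can be the limit.
support_sizes = [sum(1 for x in range(1, 2001) if f(n, x) != 0) for n in (10, 100, 1000)]
```

In general the distance between f_n and f_m for n < m is 1/(n+1), which tends to zero, while the pointwise limit 1/x is nonzero everywhere.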
Definition B.3. • A Banach space is a vector space with norm that is complete in
the associated metric (B.1).

• A Hilbert space is a vector space with inner product that is complete in the associated
metric (B.1), in which the norm is defined by (A.2). Equivalently, a Hilbert
space is a Banach space whose norm comes from an inner product via (A.2).

As we have seen, $\ell_c(\mathbb{N})$ fails to be a Banach space in the sup-norm, but the larger
space $\ell^\infty(\mathbb{N})$, which consists of all bounded functions $f : \mathbb{N} \to \mathbb{C}$, is (see §B.2).

Definition B.4. Two norms $\|\cdot\|$ and $\|\cdot\|'$ on the same vector space $V$ are equivalent
if there are constants $M > 0$ and $m > 0$ such that for any $v \in V$,

    $m\|v\|' \leq \|v\| \leq M\|v\|'$.    (B.3)

In that case, the two metric topologies on $V$ defined by these norms coincide, so that
in particular completeness and convergence in $\|\cdot\|$ and $\|\cdot\|'$ are the same.

Proposition B.5. Let V be a finite-dimensional vector space. All norms on V are 
equivalent, and hence V is complete in any norm. 

Proof. We derive this from a basic fact of Analysis, namely that $\mathbb{C}^n$ is complete in
the (Euclidean) norm $\|\cdot\|_2$ derived from the standard inner product (A.11), that is,

    $\|z\|_2 = \left(\sum_{i=1}^{n} |z_i|^2\right)^{1/2}$.    (B.4)

So the first step is to transfer the problem from $V$ to $\mathbb{C}^n$, where $n = \dim(V)$, by
choosing a basis $(\upsilon_i)$ of $V$, and mapping $\upsilon_i$ to the standard basis vector $u_i$ of $\mathbb{C}^n$.
Linear extension then maps $v = \sum_i z_i \upsilon_i \in V$ to $z = (z_1, \ldots, z_n) \in \mathbb{C}^n$, which gives an
isomorphism $V \to \mathbb{C}^n$. This map endows $\mathbb{C}^n$ with a new norm $\|z\| = \|v\|$ (i.e. the
given norm on $V$), which we now prove to be equivalent to $\|\cdot\|_2 \equiv \|\cdot\|'$. The second
inequality in (B.3) easily follows from Cauchy-Schwarz, viz.

    $\|z\| = \|\sum_i z_i u_i\| \leq \sum_i |z_i| \|u_i\| \leq \left(\sum_i \|u_i\|^2\right)^{1/2} \|z\|_2$.

This inequality, together with the elementary but extremely useful estimate

    $|\, \|v\| - \|w\| \,| \leq \|v - w\|$,    (B.5)

which is valid for any norm in any dimension, implies that the function $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}$
is continuous with respect to the Euclidean metric on $\mathbb{C}^n$. Now the unit sphere
$\mathbb{C}^n_1 = \{x \in \mathbb{C}^n \mid \|x\|_2 = 1\}$ in $\mathbb{C}^n$ is compact, so according to Weierstrass, the norm $\|\cdot\|$
assumes a minimum on $\mathbb{C}^n_1$. Hence there exists $\mu \in \mathbb{C}^n_1$ such that $\|\mu\| \leq \|z\|$ for all
$z \in \mathbb{C}^n_1$, and $\|\mu\| > 0$ because $\mu \neq 0$. For arbitrary nonzero $z \in \mathbb{C}^n$, the rescaled vector
$z' = z/\|z\|_2$ lies in $\mathbb{C}^n_1$, so $\|\mu\| \leq \|z'\| = \|z\|/\|z\|_2$, which is nothing but the first
inequality in (B.3) with $m = \|\mu\|$. □
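For a concrete pair of norms the constants in the equivalence can be named explicitly. The sketch below takes the 1-norm as the given norm and the Euclidean norm as the reference, for which m = 1 and M = sqrt(n) work; this specific choice of norms and constants is a standard fact used here as an illustration, not something taken from the proof above, and the sample data are random.

```python
import math
import random

def norm1(z):
    return sum(abs(c) for c in z)

def norm2(z):
    return math.sqrt(sum(abs(c) ** 2 for c in z))

n = 5
random.seed(0)  # fixed seed so the sample is reproducible
samples = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
           for _ in range(1000)]

m, M = 1.0, math.sqrt(n)
tol = 1e-12  # allowance for floating-point rounding

# Check m * ||z||_2 <= ||z||_1 <= M * ||z||_2 on every sample vector.
equivalent_on_samples = all(
    m * norm2(z) <= norm1(z) + tol and norm1(z) <= M * norm2(z) + tol
    for z in samples
)

# Both bounds are attained: a basis vector for m, the all-ones vector for M.
basis_vector = [1.0] + [0.0] * (n - 1)
ones = [1.0] * n
lower_attained = abs(norm1(basis_vector) - m * norm2(basis_vector)) < tol
upper_attained = abs(norm1(ones) - M * norm2(ones)) < tol
```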
B.2 $\ell^p$ spaces

The simplest examples of infinite-dimensional Banach spaces are the $\ell^p$-spaces,
where $1 \leq p \leq \infty$ (for $p < 1$ the Minkowski inequality (B.14) below goes in the
wrong direction, so that, by failure of the triangle inequality, eq. (B.8) below fails to
define a norm). Such spaces are defined on some set $X$, hence we write $\ell^p(X)$.

If $X = \{x_1, \ldots, x_n\}$ is finite, with cardinality $n = |X|$, then $\ell^p(X)$ consists of all
functions $f : X \to \mathbb{C}$ with pointwise operations, so that $\ell^p(X) \cong \mathbb{C}^n$ as vector spaces
through the map $f \mapsto (f(x_1), \ldots, f(x_n))$, where $\mathbb{C}^n$ is equipped with a specific (and,
for $p \neq 2$, unusual) norm. However, by Proposition B.5 we may as well take $p = 2$
and nothing has been gained compared with the linear algebra of Appendix A.

Therefore, life starts with infinite sets $X$, and we begin with the simplest of those,
viz. $X = \mathbb{N}$ (but to avoid unnecessary duplication with regard to later generalization,
although for the moment we assume $X = \mathbb{N}$, we still write $X$ for the underlying set).
We define $\ell^p = \ell^p(X)$ as the set of functions $f : X \to \mathbb{C}$ that satisfy

    $\sum_{x \in X} |f(x)|^p < \infty$  ($1 \leq p < \infty$);    (B.6)

    $\sup_{x \in X} |f(x)| < \infty$  ($p = \infty$).    (B.7)

As will be shown in far greater generality (cf. Theorem B.9), the point is that for any
$1 \leq p \leq \infty$ the set $\ell^p(X)$ thus defined is not merely a vector space (under pointwise
operations); it is even a Banach space in the norm

    $\|f\|_p = \left(\sum_{x \in X} |f(x)|^p\right)^{1/p}$  ($1 \leq p < \infty$);    (B.8)

    $\|f\|_\infty = \sup\{|f(x)|, x \in X\} = \inf\{C \geq 0 \mid |f(x)| \leq C \ \forall x \in X\}$.    (B.9)

The case $p = 2$ is unique in that $\ell^2(X)$ is also a Hilbert space in the inner product

    $\langle f, g \rangle = \sum_{x \in X} \overline{f(x)}\, g(x)$.    (B.10)

As we now outline, these expressions may be generalized to any set, to which
end we should define the meaning of (possibly uncountable) sums $\sum_{x \in X}$. Although
the generality below will only be used in §B.12, it is convenient (at little extra cost)
to cover more general codomains for $f$ than just the complex numbers $\mathbb{C}$.

Definition B.6. Let $X$ be a set, $V$ a normed vector space, $f : X \to V$ some function,
and $v \in V$. The sentence $\sum_{x \in X} f(x) = v$ means that for each $\varepsilon > 0$ there is a finite
subset $F \subset X$ such that for each finite subset $G \subset X$ with $F \subset G$, we have

    $\|\sum_{x \in G} f(x) - v\| < \varepsilon$.
In terms of nets, this means that the net $s = (s_F)$ in $V$ indexed by finite
subsets $F \subset X$ (ordered by inclusion), where $s_F = \sum_{x \in F} f(x)$, converges to $v$.

For $X = \mathbb{N}$ and $V = \mathbb{C}$ we may take $F$ to be $\{1, \ldots, N\}$ and $G$ to be $\{1, \ldots, n\}$,
where $n > N$, in which case we recover the usual notion of convergence of sums (i.e.
$\forall \varepsilon > 0 \ \exists N \in \mathbb{N} \ \forall n > N : |\sum_{x=1}^{n} f(x) - v| < \varepsilon$). However, since also more general $F$
and $G$ are allowed, Definition B.6 is in fact equivalent to absolute convergence:

Lemma B.7. Let $X$ be a set and let $f : X \to \mathbb{C}$ be some function.

1. There exists $z \in \mathbb{C}$ such that $\sum_{x \in X} f(x) = z$ iff $\sum_{x \in X} |f(x)| < \infty$.

2. If $f(x) \geq 0$ for each $x \in X$, then, in the sense of Definition B.6,

    $\sum_{x \in X} f(x) = \sup\left\{\sum_{x \in F} f(x), \ F \subset X \text{ finite}\right\}$,    (B.11)

which is true even if the supremum on the right-hand side is infinite (in which
case the left-hand side simply does not converge).

Therefore, for $f : X \to \mathbb{C}$, one may use (B.11) to check if $\sum_{x \in X} |f(x)| < \infty$, in which
case it makes sense to try and find the value $v$ of $\sum_{x \in X} f(x)$ as in Definition B.6.

Proof. 1. We write $f = f_1 + i f_2$, with $f_i : X \to \mathbb{R}$, and for given $G \subset X$, write
$G_{i\pm} = \{x \in G \mid \pm f_i(x) \geq 0\}$ (the ambiguity at those $x$ where $f_i(x) = 0$ is irrelevant). Then

    $|\sum_{x \in G} f(x)| \leq \sum_{x \in G} |f(x)| \leq \sum_{x \in G} |f_1(x)| + \sum_{x \in G} |f_2(x)|$

    $= \sum_{x \in G_{1+}} f_1(x) - \sum_{x \in G_{1-}} f_1(x) + \sum_{x \in G_{2+}} f_2(x) - \sum_{x \in G_{2-}} f_2(x)$

    $\leq 4 \sup\left\{ |\sum_{x \in G_\alpha} f(x)|, \ \alpha \in \{1+, 1-, 2+, 2-\} \right\}$.    (B.12)

Using Proposition B.8 below, the first inequality in (B.12) shows that absolute
convergence implies convergence in the sense of Cauchy, whereas the last inequality
(i.e., $\sum_{x \in G} |f(x)| \leq 4 \sup \cdots$) shows the converse.

2. We pick $\varepsilon > 0$ and abbreviate the right-hand side of (B.11) as $\sigma$. By definition
of the supremum (which we assume finite) there is a finite $F \subset X$ for which
$\sigma \geq \sum_{x \in F} f(x) > \sigma - \varepsilon$. Since the terms are positive, for any finite $G \supset F$ we
have $\sum_{x \in G} f(x) \geq \sum_{x \in F} f(x)$ and hence also $\sigma \geq \sum_{x \in G} f(x) > \sigma - \varepsilon$, from which
$|\sum_{x \in G} f(x) - \sigma| < \varepsilon$. Hence $\sum_{x \in X} f(x) = \sigma$ by Definition B.6.

The same argument works if $\sigma = \infty$, in which case for any $0 < M < \infty$ there is a
finite $F \subset X$ for which $\sum_{x \in F} f(x) > M$, and hence certainly $\sum_{x \in G} f(x) > M$. □

Leaving its proof to the reader, we state the Cauchy condition for convergence:

Proposition B.8. We have $\sum_{x \in X} f(x) = v$ for some (necessarily unique) $v \in V$, in the
sense of Definition B.6, iff for each $\varepsilon > 0$ there is a finite subset $F \subset X$ such that for
each finite subset $G' \subset X \setminus F$ we have $\|\sum_{x \in G'} f(x)\| < \varepsilon$.
For uncountable sets $X$, Definition B.6 is not as bad as it may sound, since whenever
$\sum_{x \in X} |f(x)| < \infty$ only a countable number of terms can be nonzero (proof by
contradiction; if not, there must be an $n \in \mathbb{N}$ for which infinitely many $x$ satisfy
$|f(x)| > 1/n$ (nested proof by contradiction; if not, then for all $n$, only finitely many
$x$ satisfy $|f(x)| > 1/n$, and hence, a countable union of finite sets remaining countable,
only a countable number of $x$ can have $f(x) \neq 0$), so the sum of $|f(x)|$ over
those $x$ alone already diverges). In particular, for $X = \mathbb{N}$ the sum in (B.6) has its
usual meaning. However, even for $X = \mathbb{N}$, the sums just defined only have their
usual meaning if the series in question is absolutely convergent (the standard
counterexample of a real series $\sum_n x_n$ that is convergent but not absolutely convergent is
given by $x_n = (-1)^n/n$; in the above light, taking $G = F \cup E$, where $E$ is a large but
finite set of even numbers, then makes $|\sum_{n \in G} x_n - v|$ as large as one likes).
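The effect of adjoining a large finite set of even indices can be seen numerically. In the sketch below the cut-offs 100 and 100000 are arbitrary choices made for illustration; the ordered series converges to -ln 2, and the sum over G = F ∪ E is compared against that value.

```python
import math

def x(n):
    # Terms of the conditionally convergent series sum_n (-1)^n / n.
    return (-1) ** n / n

N = 100
F = range(1, N + 1)              # initial segment {1, ..., N}
E = range(N + 2, 100_001, 2)     # even numbers between N + 2 and 100000

ordered_value = -math.log(2)     # value of the series in the usual ordered sense

sum_F = sum(x(n) for n in F)
sum_G = sum_F + sum(x(n) for n in E)   # F and E are disjoint, so this is the sum over G

deviation_F = abs(sum_F - ordered_value)
deviation_G = abs(sum_G - ordered_value)
```

The sum over F alone is already within 0.01 of -ln 2, yet enlarging F by E moves the finite sum away by more than 3; pushing the upper cut-off of E further makes the deviation grow like half the logarithm of that cut-off.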

Using the triangle inequality for the norm and the Cauchy criterion for convergence,
it is easy to show that if $V$ is a Banach space and $\sum_{x \in X} \|f(x)\| < \infty$, then
the sum $\sum_{x \in X} f(x)$ exists in $V$ (i.e., it equals some $v \in V$ in the sense of Definition B.6).
The implication is one-sided, though; the latter sum may exist even if the
former does not. For example, take $V = \ell^2(\mathbb{N})$, pick some $f \in \ell^2(\mathbb{N})$, and define
$\tilde{f} : \mathbb{N} \to \ell^2(\mathbb{N})$ by $\tilde{f}(x) = f(x)\delta_x$, where $\delta_x(y) = \delta_{xy}$ (and hence $\|\delta_x\|_2 = 1$). Then

    $\sum_{x \in \mathbb{N}} \|\tilde{f}(x)\|_2 = \sum_{x \in \mathbb{N}} |f(x)| = \|f\|_1$.

Now $\sum_{x \in \mathbb{N}} \tilde{f}(x) = f$ exists per assumption that $f \in \ell^2(\mathbb{N})$ and hence $\|f\|_2 < \infty$,
which is implied by, but is not equivalent to $\|f\|_1 < \infty$. See also §B.12 below.

In any case, the meaning of the possibly uncountable sums in (B.6) and (B.8)
should be clear now, as only finite sums (B.11) are involved; for (B.10), by Hölder's
inequality (B.15) below for $p = q = 2$, the sum in question is absolutely convergent,
and hence it falls within the scope of Definition B.6 and Lemma B.7.

Theorem B.9. For any $1 \leq p \leq \infty$ the set $\ell^p(X)$ is a vector space under pointwise
operations. Moreover, $\ell^p(X)$ is a Banach space in the norm (B.8) - (B.9).

Proof. 1. $\ell^p$ is a vector space. The case $p = \infty$ is obvious. For $1 \leq p < \infty$, use the
convexity of the function $t \mapsto t^p$ for $t \in [0, \infty)$. For convex functions one has
$f(\frac{1}{2}(t_1 + t_2)) \leq \frac{1}{2}(f(t_1) + f(t_2))$, so that $(\frac{1}{2}(t_1 + t_2))^p \leq \frac{1}{2}(t_1^p + t_2^p)$. Combined
with monotonicity of the function $t \mapsto t^p$ on $[0, \infty)$, i.e. $s \leq t \Rightarrow s^p \leq t^p$, this gives

    $|f(x) + g(x)|^p \leq (|f(x)| + |g(x)|)^p \leq 2^{p-1}(|f(x)|^p + |g(x)|^p)$,    (B.13)

so that summing over $X$ gives $\|f + g\|_p^p \leq 2^{p-1}(\|f\|_p^p + \|g\|_p^p) < \infty$.
Hence if $f \in \ell^p$ and $g \in \ell^p$, then $f + g \in \ell^p$.

2. $\|\cdot\|_p$ is a norm on $\ell^p$. The case $p = \infty$ is, once again, obvious. For $1 \leq p < \infty$,
the only nontrivial part is the triangle inequality

    $\|f + g\|_p \leq \|f\|_p + \|g\|_p$,    (B.14)

called the Minkowski inequality. This follows from Hölder's inequality

    $\|fg\|_1 \leq \|f\|_p \|g\|_q$,    (B.15)

which is valid for $f \in \ell^p$ and $g \in \ell^q$ where $1 \leq p \leq \infty$ and $1 \leq q \leq \infty$ satisfy

    $\frac{1}{p} + \frac{1}{q} = 1$.    (B.16)

Thus one has $q = p/(p - 1)$ for $1 < p < \infty$, or $q = \infty$ for $p = 1$, or $q = 1$ for
$p = \infty$. One calls $p$ and $q$ conjugate exponents (so that $p = 2$ is self-conjugate).
3. $\ell^p$ is complete in the norm $\|\cdot\|_p$. We must prove that every Cauchy sequence $(f_k)$
in $\ell^p$ converges. This takes three steps, which we first prove for $1 \leq p < \infty$.

a. Find a candidate $f$ for the limit. Since $(f_k)$ is Cauchy, for each $\varepsilon > 0$ there
exists $K \in \mathbb{N}$ such that $\|f_k - f_l\|_p < \varepsilon$ for all $k, l > K$, or

    $\|f_k - f_l\|_p^p = \sum_{x \in X} |f_k(x) - f_l(x)|^p < \varepsilon^p$.    (B.17)

Hence $|f_k(x) - f_l(x)|^p < \varepsilon^p$ for all $x$, so $(f_k(x))_k$ is a Cauchy sequence in $\mathbb{C}$.
Since $\mathbb{C}$ is complete, $(f_k(x))_k$ converges, hence we may define $f : X \to \mathbb{C}$ by

    $f(x) = \lim_{k \to \infty} f_k(x)$.    (B.18)

b. Show that $f \in \ell^p$. Note that

    $\|g\|_p^p = \sup_{F \subset X} \sum_{x \in F} |g(x)|^p$,    (B.19)

where the supremum is over all finite subsets $F \subset X$. For fixed $F$ we have

    $\sum_{x \in F} |f_k(x) - f_l(x)|^p < \varepsilon^p$.

Since the sum is finite, we may take $\lim_{k \to \infty}$, giving $\sum_{x \in F} |f(x) - f_l(x)|^p \leq \varepsilon^p$.
By (B.19), the sup over all finite $F$ yields: $\forall \varepsilon > 0 \ \exists K \in \mathbb{N}$ such that $\forall l > K$,
we have $\|f - f_l\|_p^p \leq \varepsilon^p$. For fixed $\varepsilon$ and $l$, this says that $f - f_l \in \ell^p$, so $f \in \ell^p$
because $f = (f - f_l) + f_l$ with $f_l \in \ell^p$, and we know that $\ell^p$ is a vector space.

c. Show that $f_k \to f$ in $\ell^p$. This is contained in the previous step, since we had

    $\forall \varepsilon > 0 \ \exists K \in \mathbb{N} \ \forall l > K : \|f - f_l\|_p \leq \varepsilon$.    (B.20)

But this is the same as $\lim_{l \to \infty} \|f - f_l\|_p = 0$, or $f_l \to f$ in $\ell^p$.

The proof for $p = \infty$ is virtually the same, with (B.19) replaced by

    $\|g\|_\infty = \sup_{F \subset X} \sup_{x \in F} \{|g(x)|\}$.    (B.21)

Within the finite supremum $\sup_{x \in F} |f_k(x) - f_l(x)| < \varepsilon$, we may take the limit
$k \to \infty$ once again, followed by a supremum over $F \subset X$. □
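The Hölder and Minkowski inequalities used in step 2 of this proof, and the failure of the triangle inequality for p < 1 mentioned at the start of the section, can be checked on short finite sequences. The sequences `f` and `g` and the exponent p = 3 below are hypothetical choices for illustration.

```python
import math

def norm(f, p):
    # The p-norm of a finite sequence; p = math.inf gives the sup-norm.
    if p == math.inf:
        return max(abs(v) for v in f)
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

f = [1.0, -2.0, 0.5, 3.0]
g = [0.25, 1.0, -4.0, 2.0]
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so that 1/p + 1/q = 1

fg = [s * t for s, t in zip(f, g)]
f_plus_g = [s + t for s, t in zip(f, g)]

holder_ok = norm(fg, 1) <= norm(f, p) * norm(g, q)
minkowski_ok = norm(f_plus_g, p) <= norm(f, p) + norm(g, p)

# For p = 1/2 the same expression violates the triangle inequality.
lhs_half = norm([1.0, 1.0], 0.5)                            # (1 + 1)^2 = 4
rhs_half = norm([1.0, 0.0], 0.5) + norm([0.0, 1.0], 0.5)    # 1 + 1 = 2
```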
B.3 Banach spaces of continuous functions

Further Banach spaces that can be defined without measure theory come from topology, notably from the class of locally compact spaces $X$ (like $\mathbb{N}$, or $\mathbb{R}^n$, etc.).

For any $f : X \to \mathbb{C}$, define the support of $f$ as the closure of the set where $f \neq 0$.

Definition B.10. Let $X$ be a locally compact space. Then:

• $C(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$;

• $C_c(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ with compact support;

• $C_0(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity, i.e., for any $\varepsilon > 0$ the set $\{x \in X \mid |f(x)| \geq \varepsilon\}$ is compact, or, equivalently, for any $\varepsilon > 0$ there is a compact set $K \subset X$ such that $|f(x)| < \varepsilon$ for all $x \notin K$;

• $C_b(X)$ is the set of all continuous functions $f : X \to \mathbb{C}$ that are bounded, i.e., there is a constant $C > 0$ (which depends on $f$) such that $|f(x)| \leq C$ for all $x \in X$.

In general, one has the obvious inclusions

$$C_c(X) \subseteq C_0(X) \subseteq C_b(X) \subseteq C(X), \tag{B.22}$$

with strict inclusions iff $X$ is non-compact, and equalities iff $X$ is compact.

For example, if $X = \mathbb{R}$, then $f(x) = \exp(-x^2)$ lies in $C_0$, whereas $f(x) = 1$ is in $C_b$. If $X$ is discrete, the spaces $\ell_c(X)$ and $\ell^\infty(X)$ of the previous section are the same as $C_c(X)$ and $C_b(X)$, respectively, and we may also write $\ell_0(X) = C_0(X)$.

Theorem B.11. The sets $C_c(X)$, $C_0(X)$, $C_b(X)$, and $C(X)$ are vector spaces under pointwise operations, and $C_0(X)$ and $C_b(X)$ are Banach spaces in the sup-norm

$$\|f\|_\infty = \sup_{x \in X} \{|f(x)|\}. \tag{B.23}$$

In particular, if $X$ is compact, then $C(X)$ is a Banach space in the norm (1.24).

Proof. Only completeness in the sup-norm (B.23) is nontrivial. We use the fact from elementary analysis that sup-norm (i.e., uniform) limits $f$ of Cauchy sequences $(f_n)$ of continuous functions exist (they are given by the pointwise limit $f(x) = \lim_n f_n(x)$) and are continuous. Therefore, concerning $C_0(X)$ we just need to show that the limit $f$ of such a sequence $(f_n)$ in $C_0(X)$ vanishes at infinity. Indeed, for given $\varepsilon > 0$, since $f_n \to f$ uniformly, we can find $N$ such that $|f(x) - f_n(x)| < \varepsilon/2$ for all $x$ and all $n > N$. Fixing some $n > N$, since $f_n \in C_0(X)$ we can also find some compact $K \subset X$ such that $|f_n(x)| < \varepsilon/2$ for all $x \notin K$. Hence for $x \notin K$,

$$|f(x)| \leq |f(x) - f_n(x)| + |f_n(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \tag{B.24}$$

To show that the limit $f$ of a sequence $(f_n)$ in $C_b(X)$ is again bounded, note that for $\varepsilon > 0$ we have $|f(x) - f_n(x)| < \varepsilon$ for $n > N$ and $|f_n(x)| \leq C_n$, both for all $x$, whence

$$|f(x)| \leq |f(x) - f_n(x)| + |f_n(x)| < \varepsilon + C_n < \infty, \tag{B.25}$$

so $f$ is bounded and hence lies in $C_b(X)$. □
B.4 Basic measure theory

Measure theory studies measure spaces $(X, \Sigma, \mu)$, where $X$ is a set, and:

• $\Sigma \subset \mathcal{P}(X)$ is a so-called σ-algebra of subsets of $X$, which means that:

1. $X \in \Sigma$;

2. If $A \in \Sigma$, then $A^c \in \Sigma$ (where $A^c = X \setminus A$ is the complement of $A$);

3. If $A_n \in \Sigma$ for $n \in \mathbb{N}$, then $\bigcup_n A_n \in \Sigma$ (i.e., $\Sigma$ is closed under countable unions).

It follows that $\emptyset \in \Sigma$, and that $\Sigma$ is closed under countable intersections, too.

• $\mu : \Sigma \to [0, \infty]$, called a (positive) measure, is countably additive, i.e.,

$$\mu\Big(\bigcup_n A_n\Big) = \sum_n \mu(A_n), \tag{B.26}$$

whenever $A_n \in \Sigma$, $n \in \mathbb{N}$, $A_i \cap A_j = \emptyset$ for all $i \neq j$. The obvious convention here is that $t + \infty = \infty$ for any $t \in \mathbb{R}^+$, as well as $\infty + \infty = \infty$. Countable additivity is indispensable in almost every limit argument in measure theory.

A probability space is a measure space $(X, \Sigma, \mu)$ for which $\mu(X) = 1$. More generally, a measure space is called finite if $\mu(X) < \infty$, which evidently implies $\mu(A) < \infty$ for any $A \in \Sigma$, and σ-finite if $X$ is a countable union $X = \bigcup_n A_n$ with $\mu(A_n) < \infty$ for each $n$. For example, $X = \mathbb{R}$ with Lebesgue measure is σ-finite, whilst $X = [0, 1]$ with Lebesgue measure is finite. The non-σ-finite case is pathological and hardly occurs in practice.

This definition of a σ-algebra marks a difference with a topology on $X$, which is a collection $\mathcal{O}(X)$ of open subsets (containing $X$ and the empty set $\emptyset$) that is closed under arbitrary unions and finite intersections (but not under complementation!). Nonetheless, topology and measure theory are closely related:

1. Any topological space $X$ gives rise to a σ-algebra $\mathcal{B}(X)$, viz. the smallest σ-algebra in $\mathcal{P}(X)$ that contains $\mathcal{O}(X)$ (this exists and equals the intersection of all σ-algebras that contain $\mathcal{O}(X)$, where one notes that the intersection of any family of σ-algebras is again a σ-algebra). Elements of $\mathcal{B}(X)$ are called Borel sets.

2. The definition of a continuous function $f : X \to Y$ between topological spaces $X$ and $Y$ as a function for which $f^{-1}(V) \in \mathcal{O}(X)$ for each $V \in \mathcal{O}(Y)$, is copied by saying that $f : X \to Y$ is measurable with respect to given σ-algebras $\Sigma_X$ (on $X$) and $\Sigma_Y$ (on $Y$) if $f^{-1}(B) \in \Sigma_X$ for any $B \in \Sigma_Y$.

3. If $X$ and $Y$ are topological spaces and $\Sigma_X = \mathcal{B}(X)$, $\Sigma_Y = \mathcal{B}(Y)$, then it is easy to show that $f$ is (Borel) measurable iff $f^{-1}(B) \in \Sigma_X$ merely for any $B \in \mathcal{O}(Y)$, from which it follows that each continuous function is measurable. For $f : X \to \mathbb{R}$ to be measurable it is even sufficient that $f^{-1}((t, \infty)) \in \Sigma_X$ for each $t \in \mathbb{R}$.

4. The above condition of σ-finiteness is often used just in case the $A_n$ are compact.

An important goal of measure theory is to provide a rigorous theory of integration; here the key idea (due to Lebesgue) is that in defining the integral of some measurable function $f : X \to \mathbb{R}$, one should partition the range $\mathbb{R}$ rather than the domain $X$, as had been done in the Calculus since Newton (where typically $X \subset \mathbb{R}^n$). This, in turn, suggests that $f$ should first be approximated by simple functions.
These are measurable functions $s : X \to \mathbb{R}^+$ with finite range, or, equivalently,

$$s = \sum_{i=1}^n \lambda_i 1_{A_i}, \tag{B.27}$$

where $\lambda_i \geq 0$, $A_i \in \Sigma$, and $n < \infty$. Such a representation is unique if we require that the sets $A_i$ are mutually disjoint and the coefficients $\lambda_i$ are distinct; namely, if $\{x_1, \ldots, x_n\}$ are the distinct values of $s$, one takes $A_i = s^{-1}(\{x_i\})$ and $\lambda_i = x_i$. Given some measure $\mu$, we further restrict the class of simple functions to those for which $\mu(A_i) < \infty$. One then first defines the integral of a simple function $s$, as in (B.27), by

$$\int_X d\mu\, s = \sum_i \lambda_i \mu(A_i); \tag{B.28}$$

a nontrivial argument shows that the right-hand side is independent of the particular representation (B.27) of $s$ used on the left. Granting this, linearity of the integral on simple functions is immediate. Subsequently, for positive measurable functions $f \geq 0$, writing $s \leq f$ iff $s(x) \leq f(x)$ for each $x \in X$, one defines the integral by

$$\int_X d\mu\, f = \sup\Big\{ \int_X d\mu\, s \;\Big|\; 0 \leq s \leq f,\ s \text{ simple} \Big\}. \tag{B.29}$$

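For measurable functions $f : X \to \mathbb{C}$, one first decomposes $f$ as

$$f = \sum_{k=0}^3 i^k f_k, \quad f_k \geq 0, \tag{B.30}$$

where, writing $f = \mathrm{Re}(f) + i\,\mathrm{Im}(f) = f' + i f''$, one has $f_0 = f'_+$, $f_2 = f'_-$, $f_1 = f''_+$, and $f_3 = f''_-$, so that $f' = f'_+ - f'_-$ and $f'' = f''_+ - f''_-$; one may take $f'_\pm = \tfrac{1}{2}(|f'| \pm f')$, and likewise for $f''$.

On this basis, one then defines the integral by linear extension of (B.29), that is,

$$\int_X d\mu\, f = \sum_{k=0}^3 i^k \int_X d\mu\, f_k. \tag{B.31}$$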

We call $f$ integrable with respect to $\mu$, writing $f \in \mathcal{L}^1(X, \Sigma, \mu)$, if

$$\int_X d\mu\, |f| < \infty; \tag{B.32}$$

this implies that each positive part $f_k$, and hence also $f$ itself, is integrable, i.e.,

$$\Big| \int_X d\mu\, f \Big| < \infty. \tag{B.33}$$

However, (B.33) does not imply (B.32). From (B.32) one has the useful estimates

$$\Big| \int_X d\mu\, f \Big| \leq \int_X d\mu\, |f| \leq \|f\|_\infty^{\mathrm{ess}}\, \mu(X), \tag{B.34}$$
where the essential supremum of $f$ (with respect to $\mu$) is defined by

$$\|f\|_\infty^{\mathrm{ess}} = \inf\{t \in [0, \infty] \mid |f| \leq t \ \mu\text{-almost everywhere}\}, \tag{B.35}$$

where $|f| \leq t$ μ-a.e. means that $\mu(\{x \in X \mid |f(x)| > t\}) = 0$. In (B.34), the expressions $\|f\|_\infty^{\mathrm{ess}}$ and/or $\mu(X)$ may well be infinite (in which case the second estimate still holds, of course!). However, if $X$ is a locally compact space (see the next section), $\mu$ is finite, and $f \in C_0(X)$ or even $f \in C_b(X)$, then all of (B.34) is finite.

Linearity of the integral is far from trivial: the proof relies on linearity for simple functions, as well as on a fundamental approximation lemma:

Lemma B.12. If $f \geq 0$ is measurable, there is a monotone increasing sequence of simple functions $s_n$, i.e., such that $0 \leq s_1 \leq s_2 \leq \cdots \leq s_n \leq s_{n+1} \leq \cdots \leq f$ pointwise, for which $s_n \to f$ pointwise (i.e., $\lim_{n \to \infty} s_n(x) = f(x)$ for each $x \in X$).
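The text does not say how the sequence in Lemma B.12 is built. One standard choice, assumed here purely for illustration, is the dyadic truncation $s_n(x) = \min(n, \lfloor 2^n f(x) \rfloor / 2^n)$, which partitions the range of $f$ rather than its domain. The sketch below evaluates this choice pointwise for a hypothetical sample function; the name `dyadic_simple` is not from the text.

```python
import math

def dyadic_simple(f, n):
    """Return s_n with s_n(x) = min(n, floor(2^n f(x)) / 2^n) for a function f >= 0.

    s_n takes only finitely many values (multiples of 2^-n up to n), so it is
    a simple function whenever f is measurable.
    """
    def s_n(x):
        return min(float(n), math.floor((2 ** n) * f(x)) / (2 ** n))
    return s_n

# Hypothetical sample function f(x) = x^2, evaluated at a few points.
f = lambda x: x * x
points = [0.0, 0.3, 1.0, 1.7, 2.5]
approximations = [[dyadic_simple(f, n)(x) for n in range(1, 12)] for x in points]
```

Each row of `approximations` is non-decreasing in $n$ and bounded by $f(x)$, and once $n \geq f(x)$ the error $f(x) - s_n(x)$ is below $2^{-n}$.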

Furthermore, one needs one of the two great convergence theorems of measure theory named after Lebesgue, both of which (for future use) we now state. In these theorems (as well as in many others), we say that a measurable function $f : X \to \mathbb{C}$ has some property μ-almost everywhere (μ-a.e.) if the set where $f$ does not have the said property has measure zero. For example, $f = 0$ μ-a.e. means that $f(x) = 0$ for each $x \notin N$, for some measurable set $N$ with $\mu(N) = 0$ (as they say, "morally", the behaviour of measurable functions on subsets of measure zero should not matter).

Theorem B.13. Let $(f_n)$ be a sequence of (complex-valued) measurable functions.

1. Dominated Convergence: if $(f_n)$ converges pointwise μ-a.e. to some function $f$ and $|f_n(x)| \leq g(x)$ μ-a.e. for some $g \in \mathcal{L}^1(X, \Sigma, \mu)$, then $f \in \mathcal{L}^1(X, \Sigma, \mu)$ and

$$\lim_{n \to \infty} \int_X d\mu\, f_n = \int_X d\mu\, f. \tag{B.36}$$

2. Monotone Convergence: if $f_n \geq 0$ and $(f_n)$ is monotone increasing μ-a.e., and

$$\sup_n \int_X d\mu\, f_n < \infty, \tag{B.37}$$

then $\lim_{n \to \infty} f_n(x) = f(x)$ exists μ-a.e., $f \in \mathcal{L}^1(X, \Sigma, \mu)$, and (B.36) holds.

Note that the first conclusion of the monotone convergence theorem is an assumption in the dominated one! Either way, the fact that the pointwise limit function $f$ is integrable, being implicit in the notation $f \in \mathcal{L}^1(X, \Sigma, \mu)$, is part of the result.

Corollary B.14. Integration is linear, i.e., if $f_1, f_2$ are integrable and $\lambda_1, \lambda_2 \in \mathbb{C}$,

$$\int_X d\mu\, (\lambda_1 f_1 + \lambda_2 f_2) = \lambda_1 \int_X d\mu\, f_1 + \lambda_2 \int_X d\mu\, f_2. \tag{B.38}$$

Proof. If $f_1 \geq 0$, $f_2 \geq 0$, let $s_n^{(1)} \to f_1$ and $s_n^{(2)} \to f_2$, as in Lemma B.12. Then the conditions of the monotone convergence theorem hold, because integration is itself a monotone operation (i.e., if $f \leq g$, then $\int_X d\mu\, f \leq \int_X d\mu\, g$). Combined with linearity on simple functions (as already established above), this yields the claim. □
B.5 Measure theory on locally compact Hausdorff spaces

For us it suffices to deal with locally compact Hausdorff spaces $X$. Our main goal is Corollary B.21. We say that a map $\varphi : C(X) \to \mathbb{C}$ is positive if $\varphi(f) \geq 0$ whenever $f \geq 0$ (pointwise). We also write $\mathcal{O}(X)$ for the set of open subsets of $X$, whilst $\mathcal{K}(X)$ denotes the set of all compact subsets of $X$. We first assume that $X$ is compact. Any finite measure $\mu : \mathcal{B}(X) \to [0, \infty)$ gives rise to a positive linear map $\varphi : C(X) \to \mathbb{C}$,

$$\varphi(f) = \int_X d\mu\, f, \quad f \in C(X). \tag{B.39}$$

Conversely, any such map canonically defines a finite measure $\mu$ at least on opens $U \in \mathcal{O}(X)$ and on compacta $K \in \mathcal{K}(X)$ (which are key examples of Borel sets) by

$$\mu(U) = \sup\{\varphi(f) \mid f \in C_c(U),\ 0 \leq f \leq 1_X\}; \tag{B.40}$$

$$\mu(K) = \inf\{\varphi(f) \mid f \in C_c(X),\ 0 \leq f \leq 1_X,\ f|_K = 1_K\}. \tag{B.41}$$

Subsequently, this preliminary measure is (hopefully!) to be extended to at least all of $\mathcal{B}(X)$, i.e., to all Borel sets, in such a way that $\mu$ recovers $\varphi$ via (B.39).

This works, and one even obtains a bijective correspondence between finite measure spaces $(X, \Sigma, \mu)$ and positive linear maps $\varphi : C(X) \to \mathbb{C}$ if the former are subjected to two additional conditions, predicated on having $\mathcal{B}(X) \subset \Sigma$, namely:

• completeness, in that $\mu(B) = 0$ and $A \subset B$ for $A \in \mathcal{P}(X)$, $B \in \Sigma$ imply $A \in \Sigma$;

• regularity, i.e., for a given measure $\mu : \Sigma \to [0, \infty]$, for any $A \in \Sigma$, one has

$$\mu^*(A) = \mu_*(A) = \mu(A), \tag{B.42}$$

where the outer measure $\mu^*$ and inner measure $\mu_*$ are defined by

$$\mu^*(A) = \inf\{\mu(U) \mid U \supseteq A,\ U \in \mathcal{O}(X)\}; \tag{B.43}$$

$$\mu_*(A) = \sup\{\mu(K) \mid K \subseteq A,\ K \in \mathcal{K}(X)\}, \tag{B.44}$$

respectively. These expressions apparently make sense for all subsets $A \subset X$, but lovers of the Banach–Tarski Paradox may be reassured that $\mu^*$ and $\mu_*$ typically fail to be countably additive if they are seen as maps from $\mathcal{P}(X)$ to $[0, \infty]$.

For future reference we also define $(X, \Sigma, \mu)$ to be inner regular if (merely) $\mu_*(A) = \mu(A)$ for $A \in \Sigma$, and outer regular if (merely) $\mu^*(A) = \mu(A)$, $A \in \Sigma$. So a regular measure is both inner and outer regular. We are now in a position to state the Riesz Representation Theorem (often attributed also to Radon).

Theorem B.15. Let $X$ be a compact Hausdorff space. There is a bijective correspondence between complete regular finite measure spaces $(X, \Sigma, \mu)$ and positive linear maps $\varphi : C(X) \to \mathbb{C}$, explicitly given as follows:

• The measure space $(X, \Sigma, \mu)$ defines $\varphi$ through (B.39), assuming (B.29) - (B.31);

• The map $\varphi$ defines the pair $(\Sigma, \mu)$ in three steps:

1. $\mu$ is given on opens $U$ and on compacta $K$ by (B.40) and (B.41), respectively;

2. $\Sigma$ is defined as the collection of all sets $A \in \mathcal{P}(X)$ for which $\mu^*(A) = \mu_*(A)$;

3. $\mu$ is given on all of $\Sigma$ by $\mu(A) = \mu^*(A)$, using (B.43), or, equivalently (given the previous point), by $\mu(A) = \mu_*(A)$, based on (B.44).

We omit the lengthy proof, except for announcing that Theorem B.15 may be seen as a special case of the more advanced Choquet theory reviewed in §B.11. For now, just note that expressions like (B.40) and (B.41) are really desperate attempts to define "$\mu(A) = \varphi(1_A)$", which is OK for finite $X$, but in general is ill defined because even for Borel sets $A$, the characteristic function $1_A$ is rarely continuous on $X$.

We note that $\mu$ has to be finite, since obviously $\mu(X) = \varphi(1_X)$. One can say a little more about this. A linear map $\varphi : C(X) \to \mathbb{C}$ is bounded if, for some $0 \leq C < \infty$,

$$|\varphi(f)| \leq C \|f\|_\infty \quad (f \in C(X)). \tag{B.45}$$

In that case, the following expression, called the norm of $\varphi$, is $\leq C$, hence finite:

$$\|\varphi\| = \sup\{|\varphi(f)|,\ f \in C(X),\ \|f\|_\infty = 1\}. \tag{B.46}$$

Proposition B.16. Let $X$ be a compact Hausdorff space. If a linear map $\varphi : C(X) \to \mathbb{C}$ is positive, then it is bounded, with norm

$$\|\varphi\| = \varphi(1_X). \tag{B.47}$$

Proof. Positivity makes $\langle f, g \rangle = \varphi(f^* g)$ a pre-inner product on $C(X)$, so by (A.1) with $v = 1_X$ and $w = f$, we find $|\varphi(f)|^2 \leq \varphi(|f|^2)\, \varphi(1_X)$ for any $f$. If $\|f\|_\infty = 1$, then pointwise $0 \leq |f|^2 \leq 1_X$, so by positivity, $\varphi(|f|^2) \leq \varphi(1_X)$. Hence $|\varphi(f)| \leq \varphi(1_X)$, so that $\|\varphi\| \leq \varphi(1_X)$. Finally, taking $f = 1_X$ in (B.46) gives equality. □
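For a finite (hence compact, discrete) space $X$, every positive linear map on $C(X)$ is a weighted sum $\varphi(f) = \sum_x w_x f(x)$ with $w_x \geq 0$, and (B.47) reduces to $\|\varphi\| = \sum_x w_x$. The following sketch checks the bound $|\varphi(f)| \leq \varphi(1_X)$ on a few functions of sup-norm one; the weights and test functions are hypothetical values chosen only for this illustration.

```python
import cmath

# Hypothetical point masses w_x = mu({x}) >= 0 on a four-point space X.
weights = [0.5, 2.0, 0.25, 1.25]

def phi(f):
    """Positive linear functional phi(f) = sum_x w_x f(x), cf. (B.39)."""
    return sum(w * v for w, v in zip(weights, f))

def sup_norm(f):
    """Sup-norm (B.23) of a function on the finite space X."""
    return max(abs(v) for v in f)

one = [1.0] * len(weights)  # the unit function 1_X

# A few (real and complex) functions with sup-norm equal to 1.
unit_functions = [
    [1.0, -1.0, 1.0, -1.0],
    [cmath.exp(1j * k) for k in range(4)],
    [0.3, 1.0, -0.7, 0.0],
]
values = [abs(phi(f)) for f in unit_functions]
```

None of the entries of `values` should exceed `phi(one)`, which equals the total mass $\mu(X)$, and the bound is attained at $f = 1_X$.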

A state on $C(X)$ is a positive linear functional $\omega : C(X) \to \mathbb{C}$ with $\omega(1_X) = 1$.

Corollary B.17. If $X$ is a compact Hausdorff space, there is a bijective correspondence between states on $C(X)$ and complete regular probability measures on $X$.

We now move to the next case in difficulty, where $X$ is assumed to be σ-compact, in being a countable union of compact sets, i.e., $X = \bigcup_n K_n$, where $K_n \in \mathcal{K}(X)$. Using a little topology, this is actually equivalent to $X$ being a perhaps more appealing union $X = \bigcup_n U_n$, where each $U_n$ is open with compact closure $\overline{U_n}$, and $\overline{U_n} \subset U_{n+1}$. This, in turn, implies that $X = \bigcup_n K_n$ with $K_n \subset K_{n+1}$ all compact. If $(X, \Sigma, \mu)$ is a measure space where $X$ is σ-compact topologically, $\mathcal{B}(X) \subset \Sigma$, and

$$\mu(K) < \infty \quad (K \in \mathcal{K}(X)), \tag{B.48}$$

then $X$ is also σ-finite measure-theoretically. Since these are the only σ-finite measure spaces we will consider, with a slight change in terminology we call a locally compact measure space $(X, \Sigma, \mu)$ σ-finite if it is also σ-compact and (B.48) holds.

The new point compared to the compact case is that functionals like the above $\varphi$ should now be defined on the space $C_c(X)$ of continuous functions on $X$ with compact support. Otherwise, Theorem B.15 may be repeated almost verbatim:
Theorem B.18. Let $X$ be a σ-compact Hausdorff space. There is a bijective correspondence between complete regular σ-finite measure spaces $(X, \Sigma, \mu)$ and positive linear maps $\varphi : C_c(X) \to \mathbb{C}$, explicitly given as in Theorem B.15.

For the sake of completeness we also state Theorem B.18 in the case where $X$ is not even assumed to be σ-compact. In that case, inner regularity may be lost:

Theorem B.19. Let $X$ be a locally compact Hausdorff space. There is a bijective correspondence between complete outer regular measure spaces $(X, \Sigma, \mu)$ satisfying (B.48), and positive linear maps $\varphi : C_c(X) \to \mathbb{C}$, explicitly given as in Theorem B.15, except for the fact that $\Sigma$ now consists of all $A \in \mathcal{P}(X)$ for which $\mu^*(A \cap K) < \infty$ and $\mu^*(A \cap K) = \mu_*(A \cap K)$ for any $K \in \mathcal{K}(X)$. In that case, $\mu$ is defined by

$$\mu(A) = \mu^*(A), \quad A \in \Sigma. \tag{B.49}$$

However, this generality will not really be needed for our purposes, which will 
only require measures, in which case outer regularity implies regularity. 

In order to generalize Corollary B.17 to the σ-compact case, or even to the locally compact case, we must involve the normed spaces $C_c(X)$ and $C_0(X)$ of §B.3 (of which, in general, only $C_0(X)$ is complete). Also for linear maps $\varphi : C_c(X) \to \mathbb{C}$ or $\varphi : C_0(X) \to \mathbb{C}$ we use the notation (B.46), where now the supremum is taken over $f \in C_c(X)$ and $f \in C_0(X)$, respectively. For example, in the latter case, provided (B.45) holds, we have

$$\|\varphi\| = \sup\{|\varphi(f)|,\ f \in C_0(X),\ \|f\|_\infty = 1\}. \tag{B.50}$$

Lemma B.20. Let $X$ be a locally compact Hausdorff space.

1. $C_c(X)$ is a dense subspace of $C_0(X)$ with respect to the norm (B.23).

2. For a positive linear map $\varphi : C_c(X) \to \mathbb{C}$, the following are equivalent:

a. $\varphi$ is bounded, as in (B.45);

b. $\varphi$ can be extended to a positive linear map $\varphi : C_0(X) \to \mathbb{C}$.

In particular, a positive linear map $\varphi : C_0(X) \to \mathbb{C}$ is automatically bounded.

Proof. 1. The first claim means either of the following two equivalent properties:

• For any $f \in C_0(X)$ there is a sequence $(f_n)$ in $C_c(X)$ converging to $f$;

• For any $f \in C_0(X)$ and $\varepsilon > 0$ there is $g \in C_c(X)$ with $\|f - g\|_\infty < \varepsilon$.

We prove both. For some given $f \in C_0(X)$ and $\varepsilon > 0$, find the usual compact $K$ such that $|f(x)| < \varepsilon$ outside $K$. Urysohn's Lemma gives $h \in C_c(X)$ with $0 \leq h(x) \leq 1$ for all $x \in X$ and $h(x) = 1$ for all $x \in K$. Take $g = fh \in C_c(X)$, so that $\|f - g\|_\infty < \varepsilon$. For $\varepsilon = 1/n$, rename the $g$ thus constructed as $f_n$. Then $\|f - f_n\|_\infty \to 0$.

2. To go from 2.a to 2.b, using the previous item, let $f_n \to f$ uniformly (i.e., in the sup-norm), and define the extension $\varphi : C_0(X) \to \mathbb{C}$ by $\varphi(f) = \lim_n \varphi(f_n)$. This limit exists, since $|\varphi(f_m) - \varphi(f_n)| \leq C \|f_m - f_n\|_\infty$, so that, $(f_n)$ being convergent and hence Cauchy in $C_0(X)$, the sequence $(\varphi(f_n))$ is Cauchy in $\mathbb{C}$. The value $\varphi(f)$ is easily verified to be independent of the approximating sequence $(f_n)$. Finally, the approximation constructed in part 1 preserves positivity, i.e., if $f \geq 0$ then $f_n \geq 0$, so also $\varphi(f) \geq 0$, as it has been defined as the limit of a positive sequence.

By definition, the converse implication 2.b $\Rightarrow$ 2.a is equivalent to the claim that

$$\sup\{|\varphi(f)|,\ f \in C_0(X),\ \|f\|_\infty \leq 1\} < \infty, \tag{B.51}$$

which in turn is equivalent to the apparently weaker claim to the effect that

$$\sup\{|\varphi(f_n)|,\ n \in \mathbb{N}\} < \infty, \tag{B.52}$$

for any sequence $(f_n)$ with $\|f_n\| \leq 1$. Indeed, if the first supremum were infinite, then for each $n \in \mathbb{N}$ there would be $f_n$ such that $|\varphi(f_n)| > n$, and (B.52) could not possibly hold. Furthermore, (B.52) need only hold for non-negative functions $f_n \geq 0$ (still with $\|f_n\| \leq 1$, of course), cf. (B.31), since $|\varphi(f_k)| = \varphi(f_k) \leq C$ for each $k = 0, \ldots, 3$ implies $|\varphi(f)| \leq 4C$. And this, finally, reduces to the claim that

$$\sum_{n=1}^\infty g(n)\, \varphi(f_n) < \infty, \quad \forall g \in \ell^1(\mathbb{N}),\ g(n) \geq 0. \tag{B.53}$$

Namely, if the sequence $(\varphi(f_n))$ were unbounded, it would be trivial to find such a summable function $g$ for which the sum in (B.53) diverges (for example, take a subsequence $(f_{n_k})$ for which $\varphi(f_{n_k}) > k^2$, and take $g$ such that $g(n_k) = 1/k^2$ and $g(n) = 0$ otherwise).

To prove (B.53), then, given that $f_n \geq 0$ and hence $\varphi(f_n) \geq 0$, with $\|f_n\| \leq 1$, first note that $\sum_n g(n) f_n$ converges in $C_0(X)$ (since it is obviously absolutely convergent, and any absolutely convergent series in a Banach space converges). Calling the sum $h$, for any $N < \infty$ we have $\sum_{n=1}^N g(n) f_n \leq h$ and hence, by positivity of $\varphi$, also $\sum_{n=1}^N g(n)\, \varphi(f_n) \leq \varphi(h) < \infty$. Letting $N \to \infty$ gives (B.53). □

We now define a state on $C_0(X)$ as a positive (and hence bounded) linear functional $\omega : C_0(X) \to \mathbb{C}$ with $\|\omega\| = 1$; this is consistent with the terminology for the compact case because of (B.47), as well as with the terminology for C*-algebras.

Corollary B.21. Let $X$ be a locally compact Hausdorff space. There is a bijective correspondence between positive linear functionals on $C_0(X)$ and complete regular finite measures on $X$, explicitly given as in the bullet points of Theorem B.15.

In particular, states on $C_0(X)$ correspond to regular probability measures on $X$.

Proof. All that remains to be shown is that, under (B.39), we have

$$\|\varphi\| = \mu(X), \tag{B.54}$$

so that, in particular, the case $\|\varphi\| = 1$ corresponds to $\mu(X) = 1$. For compact $X$, eq. (B.54) is immediate from (B.47). For locally compact $X$, we immediately see from (B.39) and (B.50) that $\|\varphi\| \leq \mu(X)$. To saturate this inequality, we use inner regularity of the measure $\mu$ corresponding to $\varphi$, cf. Theorem B.19 and subsequent comment. From (B.42) and (B.44), for any $\varepsilon > 0$ we can find $K \in \mathcal{K}(X)$ with $\mu(X) - \mu(K) < \varepsilon$. Now use Urysohn's Lemma to find $f \in C_c(X)$ such that $0 \leq f \leq 1$ and $f|_K = 1$. Then $\varphi(f) \geq \mu(K)$, and, letting $\varepsilon \to 0$, eq. (B.54) follows. □
Finally, we extend the above corollaries to the entire (Banach) dual $C_0(X)^*$, i.e., the space of all (i.e. not necessarily positive) bounded linear maps $\varphi : C_0(X) \to \mathbb{C}$, equipped with the norm (B.50). As we shall see more generally in §B.9, this is a vector space (under pointwise operations) and even a Banach space in its own right.

From the point of view of measure theory, the relevant concept is that of a complex measure. This is a map $\mu : \Sigma \to \mathbb{C}$ satisfying the countable additivity condition (B.26), as in the positive case. In the complex case this condition implies that $\mu$ is finite. One then (trivially) has a decomposition $\mu = \mu' + i\mu''$, where $\mu'$ and $\mu''$ are countably additive maps $\Sigma \to \mathbb{R}$ (just take $\mu' = \tfrac{1}{2}(\mu + \mu^*)$ and $\mu'' = -\tfrac{1}{2} i (\mu - \mu^*)$, where $\mu^*(A) = \overline{\mu(A)}$), and (nontrivially) has the (Hahn)-Jordan decomposition:

Theorem B.22. Let $\Sigma$ be a σ-algebra on a set $X$ and let $\mu$ be a (finite) signed measure, i.e., a countably additive map $\Sigma \to \mathbb{R}$. Then there is a unique decomposition

$$\mu = \mu_+ - \mu_-, \tag{B.55}$$

where the measures $\mu_\pm : \Sigma \to \mathbb{R}^+$ are given by:

$$\mu_+(A) = \sup\{\mu(B) \mid B \subseteq A,\ B \in \Sigma\}; \tag{B.56}$$

$$\mu_-(A) = -\inf\{\mu(B) \mid B \subseteq A,\ B \in \Sigma\}, \tag{B.57}$$

and $\mu_+$ and $\mu_-$ are mutually singular in that there is a set $N \in \Sigma$ such that

$$\mu_+(N) = \mu_-(X \setminus N) = 0. \tag{B.58}$$

We will not prove this, just noting that in terms of the total variation $|\mu|$ of $\mu$, i.e.,

$$|\mu|(A) = \sup\Big\{ \sum_n |\mu(A_n)| \Big\}, \tag{B.59}$$

where the supremum is taken over all measurable partitions $A = \bigcup_n A_n$, one has

$$\mu_\pm = \tfrac{1}{2}(|\mu| \pm \mu). \tag{B.60}$$
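On a finite set $X$ with $\Sigma = \mathcal{P}(X)$, a signed measure is fixed by its values on points, and the formulas (B.56) - (B.60) can be evaluated by brute force. The sketch below does so under that assumption; the point weights and helper names are hypothetical, and the total variation is computed from the partition into single points, which attains the supremum in (B.59) because refining a partition can only increase the sum.

```python
from itertools import chain, combinations

def subsets(points):
    """Return an iterator over all subsets (as tuples) of a finite collection."""
    items = list(points)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def mu(weights, A):
    """Signed measure of A, determined by (hypothetical) point weights."""
    return sum(weights[x] for x in A)

def mu_plus(weights, A):
    """(B.56): supremum of mu(B) over all subsets B of A."""
    return max(mu(weights, B) for B in subsets(A))

def mu_minus(weights, A):
    """(B.57): minus the infimum of mu(B) over all subsets B of A."""
    return -min(mu(weights, B) for B in subsets(A))

def total_variation(weights, A):
    """(B.59), using the partition of A into single points."""
    return sum(abs(weights[x]) for x in A)

# Hypothetical signed point weights on X = {a, b, c, d}.
weights = {"a": 2.0, "b": -1.5, "c": 0.5, "d": -3.0}
X = tuple(weights)
N = ("b", "d")          # the points of negative weight
complement = ("a", "c") # X \ N
```

With these values one expects $\mu_+(N) = \mu_-(X \setminus N) = 0$, as in (B.58), and $\mu_\pm(A) = \tfrac{1}{2}(|\mu|(A) \pm \mu(A))$ for every $A \subseteq X$, as in (B.60).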

From the point of view of C*-algebras it is more natural to start from bounded linear functionals on $C_0(X)$. First, we call a map $\varphi : C_0(X) \to \mathbb{C}$ hermitian if

$$\varphi(f^*) = \overline{\varphi(f)} \quad (f^*(x) = \overline{f(x)}). \tag{B.61}$$

Theorem B.23. 1. Any functional $\varphi \in C_0(X)^*$ has a unique decomposition

$$\varphi = \varphi' + i\varphi'', \tag{B.62}$$

where the functionals $\varphi' \in C_0(X)^*$ and $\varphi'' \in C_0(X)^*$ are hermitian.

2. Any hermitian functional $\varphi \in C_0(X)^*$ has a decomposition

$$\varphi = \varphi_+ - \varphi_-, \tag{B.63}$$

where the functionals $\varphi_\pm \in C_0(X)^*$ are positive, and are given on $f \geq 0$ by

$$\varphi_+(f) = \sup\{\varphi(g),\ g \in C_0(X),\ 0 \leq g \leq f\}; \tag{B.64}$$

$$\varphi_-(f) = -\inf\{\varphi(h),\ h \in C_0(X),\ 0 \leq h \leq f\}. \tag{B.65}$$

3. These expressions satisfy

$$\|\varphi\| = \|\varphi_+\| + \|\varphi_-\|, \tag{B.66}$$

and any positive functionals $\varphi_\pm \in C_0(X)^*$ that satisfy (B.63) as well as (B.66) are necessarily given by (B.64) - (B.65).

4. Any functional $\varphi \in C_0(X)^*$ is a linear combination of at most four states.

Proof. 1. Take $\varphi' = \tfrac{1}{2}(\varphi + \varphi^*)$ and $\varphi'' = -\tfrac{1}{2} i (\varphi - \varphi^*)$, where $\varphi^*(f) = \overline{\varphi(f^*)}$.

2. The range $h : 0 \leq h \leq f$ is the same as the range $h : 0 \leq f - h \leq f$, so that (B.64) - (B.65) gives (B.63). Positivity of $\varphi_+$ follows because the value $\varphi(0) = 0$ is included in the supremum in (B.64), which therefore can only be $\geq 0$, and likewise $-\varphi_-$ is negative (and hence $\varphi_-$ is positive) because $\varphi(0) = 0$ is included in the infimum in (B.65), which therefore can only be $\leq 0$.

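3. We first prove (B.66) for compact $X$, so that $1_X \in C_0(X) = C(X)$. From (B.47),

$$\|\varphi\| \leq \|\varphi_+\| + \|\varphi_-\| = \varphi_+(1_X) + \varphi_-(1_X) \tag{B.67}$$

$$= \sup\{\varphi(g),\ 0 \leq g \leq 1_X\} - \inf\{\varphi(h),\ 0 \leq h \leq 1_X\}. \tag{B.68}$$

For any $\varepsilon > 0$, there is $g$ such that $\varphi(g)$ is close to the supremum in (B.68) by $\tfrac{1}{2}\varepsilon$, and likewise there is $h$ such that $\varphi(h)$ is close to the infimum in (B.68) by the same amount, so that

$$|\varphi_+(1_X) + \varphi_-(1_X) - \varphi(g - h)| < \varepsilon. \tag{B.69}$$

Since $0 \leq g \leq 1_X$ and $0 \leq h \leq 1_X$, we have $\|g - h\| \leq 1$, and therefore

$$\varphi(g - h) \leq \|\varphi\|\, \|g - h\| \leq \|\varphi\|. \tag{B.70}$$

Hence (B.67) gives

$$\|\varphi\| \leq \|\varphi_+\| + \|\varphi_-\| \leq \|\varphi\| + \varepsilon, \tag{B.71}$$

so letting $\varepsilon \to 0$ yields (B.66).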

For locally compact $X$, we reduce the proof to the compact case by forming the one-point compactification $\dot{X}$ of $X$, cf. §C.6. As a set, this is $\dot{X} = X \cup \{\infty\}$, where $\{\infty\}$ is a singleton. As a space, the open sets in $\dot{X}$ are the open sets in $X$ plus those subsets of $\dot{X}$ whose complement is compact in $X$. The obvious injection $\iota : X \hookrightarrow \dot{X}$ is continuous, and any $f \in C_0(X)$ extends uniquely to a function $\dot{f} \in C(\dot{X})$ that vanishes at the compactification point, i.e., $\dot{f}(\infty) = 0$. This yields an isometric embedding $C_0(X) \hookrightarrow C(\dot{X})$. Furthermore, as vector spaces one has

$$C(\dot{X}) = C_0(X) \oplus \mathbb{C} \cdot 1_{\dot{X}}. \tag{B.72}$$

Any linear map $\varphi$ on $C_0(X)$ may then be extended to a linear map $\dot{\varphi}$ on $C(\dot{X})$ via

$$\dot{\varphi}(f + \lambda 1_{\dot{X}}) = \varphi(f) + \lambda \|\varphi\|, \quad f \in C_0(X),\ \lambda \in \mathbb{C}. \tag{B.73}$$

From the point of view of (B.39), this extension may alternatively be described as follows: extend the measure $\mu$ on $X$ that underlies $\varphi$ to a measure $\dot{\mu}$ on $\dot{X}$ by $\dot{\mu}(A \cup \{\infty\}) = \mu(A)$, $A \in \Sigma$. This shows that $\dot{\varphi}$ remains positive when $\varphi$ is, and using (B.54) and the analogue of (B.47) for $\dot{X}$ instead of $X$, we also obtain

$$\|\dot{\varphi}\| = \dot{\varphi}(1_{\dot{X}}) = \dot{\mu}(\dot{X}) = \mu(X) = \|\varphi\|. \tag{B.74}$$

One may then repeat the proof of the compact case, using $\dot{\varphi}$ instead of $\varphi$.

We just prove uniqueness for the compact case (in general, add dots as in the previous proof). Suppose $\varphi = \varphi'_+ - \varphi'_-$. For $f \geq 0$, using (B.64) and $\varphi'_-(g) \geq 0$,

$$\varphi_+(f) = \sup\{\varphi'_+(g) - \varphi'_-(g),\ 0 \leq g \leq f\} \leq \sup\{\varphi'_+(g),\ 0 \leq g \leq f\} \leq \varphi'_+(f),$$

so $\psi = \varphi'_+ - \varphi_+ \geq 0$. With $\varphi'_\pm = \varphi_\pm + \psi$, imposing $\|\varphi\| = \|\varphi'_+\| + \|\varphi'_-\|$ and repeatedly using (B.47), we find $\|\psi\| = 0$, and hence $\psi = 0$.

4. This is trivial from parts 1-2, noting that any nonzero positive functional $\varphi = t\omega$ is a multiple of a state $\omega = \varphi / \|\varphi\|$, with $t = \|\varphi\|$, since obviously $\|\omega\| = 1$.

Combining this theorem with Corollaries B.17 and B.21, we finally obtain:

Theorem B.24. Let $X$ be a locally compact Hausdorff space. The Banach dual $C_0(X)^*$ of all bounded linear maps $\varphi : C_0(X) \to \mathbb{C}$ is isometrically isomorphic with the space $M(X)$ of all complete regular complex measures $\mu$ on $X$, with norm

$$\|\mu\| = |\mu|(X). \tag{B.75}$$

In particular, if $\mu$ is real (i.e., hermitian as a functional on $C_0(X)$), then (cf. (B.55))

$$\|\mu\| = \mu_+(X) + \mu_-(X). \tag{B.76}$$

This implies Corollary B.21, including its crucial final claim to the effect that states on $C_0(X)$ correspond to regular probability measures on $X$.

We briefly sketch an analogous result for finitely additive measures. Instead of a σ-algebra of subsets of some set $X$, we now start from a so-called semiring:

Definition B.25. A semiring of subsets of $X$ is a family $\mathcal{S} \subset \mathcal{P}(X)$ such that:

1. $\emptyset \in \mathcal{S}$;

2. if $A, B \in \mathcal{S}$, then $A \cap B \in \mathcal{S}$;

3. if $A, B \in \mathcal{S}$ and $B \subset A$, then for the complement of $B$ in $A$ we have $A \setminus B = \bigcup_{i=1}^n B_i$, where $n < \infty$, each $B_i \in \mathcal{S}$, and the $B_i$ are pairwise disjoint.
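In fact, in all our examples a stronger version of axiom 3 holds: if $A, B \in \mathcal{S}$ and $B \subset A$, then $A \setminus B \in \mathcal{S}$. Indeed, we will typically have $X = \mathbb{N}$ and either $\mathcal{S} = \mathcal{P}(\mathbb{N})$ or $\mathcal{S} = \mathcal{P}_f(\mathbb{N})$ (i.e. the collection of finite subsets of $\mathbb{N}$).

Using the fundamental lemma for semirings, which states that if $A_1, \ldots, A_n \in \mathcal{S}$ there are finitely many pairwise disjoint $B_1, \ldots, B_m \in \mathcal{S}$ such that $\bigcup_n A_n = \bigcup_m B_m$, it can be shown that the complex linear span $\mathrm{Step}(X, \mathcal{S})$ of the characteristic functions $1_A$ ($A \in \mathcal{S}$) is a commutative algebra under obvious pointwise operations. Since functions in $\mathrm{Step}(X, \mathcal{S})$ are bounded, we may form the closure of $\mathrm{Step}(X, \mathcal{S})$ in the supremum-norm; adding pointwise complex conjugation this yields a commutative C*-algebra called $\ell^\infty(X, \mathcal{S})$ (which has a unit iff $X$ is a finite union of elements of $\mathcal{S}$). For example, we have

$$\ell^\infty(\mathbb{N}, \mathcal{P}(\mathbb{N})) = \ell^\infty(\mathbb{N}) = \ell^\infty; \tag{B.77}$$

$$\ell^\infty(\mathbb{N}, \mathcal{P}_f(\mathbb{N})) = \ell_0(\mathbb{N}) = c_0. \tag{B.78}$$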

Definition B.26. A finitely additive measure on $(X, \mathcal{S})$ is a map $\mu : \mathcal{S} \to [0, \infty]$ such that $\mu(A \cup B) = \mu(A) + \mu(B)$ whenever $A, B \in \mathcal{S}$, $A \cup B \in \mathcal{S}$ and $A \cap B = \emptyset$.

Similarly, we have finitely additive signed measures taking values in $\mathbb{R}$, which admit a Jordan-Hahn decomposition (B.55) with (B.56) - (B.57), just as in the σ-additive case. We say that a finitely additive signed measure $\mu$ is finite if $|\mu(A)| < \infty$ for each $A \in \mathcal{S}$, and bounded if $\sup\{|\mu(A)|, A \in \mathcal{S}\} < \infty$. With $|\mu| = \mu_+ + \mu_-$, the bounded finitely additive signed measures form a real Banach space $ba(X, \mathcal{S})$ in the norm

$$\|\mu\| = \sup\{|\mu|(A),\ A \in \mathcal{S}\}. \tag{B.79}$$

Within this space, the probability measures stand out as those measures $\mu$ that take values in $[0, 1]$ (so that $\mu = \mu_+$) and satisfy $\|\mu\| = 1$.

Functions in $\mathrm{Step}(X, \mathcal{S})$ may be integrated against measures in $ba(X, \mathcal{S})$ in the obvious way, cf. (B.27) - (B.28). This is well defined, and one easily infers that

$$\Big| \int_X d\mu\, s \Big| \leq \|\mu\|\, \|s\|_\infty, \tag{B.80}$$

for any $s \in \mathrm{Step}(X, \mathcal{S})$. Hence we may extend the integral to any $f \in \ell^\infty(X, \mathcal{S})$ by

$$\int_X d\mu\, f = \lim_{n \to \infty} \int_X d\mu\, s_n, \tag{B.81}$$

where $(s_n)$ is any sequence in $\mathrm{Step}(X, \mathcal{S})$ converging to $f$ in the sup-norm $\|\cdot\|_\infty$. This is well defined by the usual arguments. At the end of the day, we obtain:

Theorem B.27. Let $X$ be a set equipped with some semiring $\mathcal{S} \subset \mathcal{P}(X)$.

• There is a bijective correspondence between finitely additive probability measures $\mu$ on $(X, \mathcal{S})$ and states $\varphi$ on $\ell^\infty(X, \mathcal{S})$, given by (B.39) and (B.81).

• This correspondence extends to an isometric isomorphism between $ba(X, \mathcal{S})$ and the real Banach space of bounded hermitian functionals on $\ell^\infty(X, \mathcal{S})$.

• This isomorphism of real Banach spaces extends (i.e. complexifies) to an isomorphism between the complexification $ba(X, \mathcal{S})_{\mathbb{C}}$ and the (Banach) dual $\ell^\infty(X, \mathcal{S})^*$.
B.6 $L^p$ spaces

We return to the usual, countably additive setting for measure theory. In the previous section, the notion of a measure space $(X, \Sigma, \mu)$ has mainly been used to provide an integration theory for continuous functions on $X$, though (B.29) suggested greater generality. In what follows, we keep the restriction to locally compact spaces $X$ (although the theory is more general), but we expand the class of functions that can be integrated over $X$ "against the measure $\mu$". This, then, leads to an important class of Banach spaces, called $L^p(X) = L^p(X, \Sigma, \mu)$; some authors write $L^p(X, \Sigma)$, others $L^p(\mu)$. One may have examples like $X = \Omega \subset \mathbb{R}^n$ in mind, with $\Omega$ measurable (typically open or closed, like $X = \mathbb{R}^n$ or $X = [0, 1]$), and $\mu$ being Lebesgue measure. On the other hand, one may think of $X$ as a discrete space with counting measure (i.e., $\mu(\{x\}) = 1$ for each $x \in X$), in which case the space $L^p(X)$ reduces to the space $\ell^p(X)$ we already know; the typical case will be $X = \mathbb{N}$.

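Definition B.28. Given a measure space $(X, \Sigma, \mu)$ and a real number $1 \leq p < \infty$:

• For $1 \leq p < \infty$, the set $\mathcal{L}^p(X) = \mathcal{L}^p(X, \Sigma, \mu)$ consists of all measurable functions $f : X \to \mathbb{C}$ for which

$$\int_X d\mu\, |f|^p < \infty. \tag{B.82}$$

• $\mathcal{L}^\infty(X) = \mathcal{L}^\infty(X, \Sigma, \mu)$ is the set of measurable functions $f : X \to \mathbb{C}$ that are essentially bounded (with respect to $\mu$), i.e., for which

$$\inf\{t \in [0, \infty] : |f| \leq t \ (\mu\text{-almost everywhere})\} < \infty. \tag{B.83}$$

• $\mathcal{N}$ is the set of all measurable functions $f : X \to \mathbb{C}$ that vanish μ-a.e., that is,

$$\mu(\{x \in X \mid f(x) \neq 0\}) = 0. \tag{B.84}$$

• Noting that $\mathcal{N} \subset \mathcal{L}^p(X)$ for all $1 \leq p \leq \infty$, we put

$$L^p(X, \Sigma, \mu) = L^p(X) = \mathcal{L}^p(X) / \mathcal{N}. \tag{B.85}$$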

To appreciate the perhaps somewhat mysterious condition (B.83), we write

$$\inf\{t \in [0, \infty] : |f| \leq t \ \mu\text{-a.e.}\} = \inf\{t \in [0, \infty] : \mu(\{x \in X, |f(x)| > t\}) = 0\}.$$

Compare this with the expressions (defined for any function $f : X \to \mathbb{C}$):

$$\sup\{|f(x)| \mid x \in X\} = \inf\{t \in [0, \infty] : |f(x)| \leq t \ \forall x \in X\}$$
$$= \inf\{t \in [0, \infty] : \{x \in X, |f(x)| > t\} = \emptyset\} < \infty, \tag{B.86}$$

which state the condition that $f$ be bounded. Consequently, the stipulation that $f$ be essentially bounded is the same as the condition that it is bounded, except that the empty set in (B.86) has been replaced by a measure-zero set.
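The difference between the supremum and the essential supremum is easy to see on a finite set carrying a measure with some points of mass zero. The following sketch assumes such a setting; the function values, the masses and the helper names `sup_norm` and `ess_sup` are hypothetical. On a finite set the infimum over $t$ is attained either at $0$ or at one of the values $|f(x)|$, so it suffices to scan those candidates in increasing order.

```python
def sup_norm(f, points):
    """Ordinary supremum of |f| over all points, cf. (B.86)."""
    return max(abs(f[x]) for x in points)

def ess_sup(f, mass, points):
    """Smallest t such that the set {x : |f(x)| > t} has measure zero."""
    candidates = sorted({0.0} | {abs(f[x]) for x in points})
    for t in candidates:
        if sum(mass[x] for x in points if abs(f[x]) > t) == 0:
            return t

# Hypothetical data: the point 2 carries no mass, so the large value there is invisible.
points = [0, 1, 2, 3]
f = {0: 1.0, 1: -2.0, 2: 100.0, 3: 0.5}
mass = {0: 0.5, 1: 0.5, 2: 0.0, 3: 1.0}

# Changing f on the null set {2} yields a representative whose supremum
# coincides with the essential supremum of f.
g = dict(f)
g[2] = 0.0
```

Here `sup_norm(f, points)` is 100, whereas `ess_sup(f, mass, points)` is 2, which is also the ordinary supremum of the modified representative `g`.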
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Theorem B.29. For 1 < p < °°, the set FP(X) is a vector space under pointwise 
operations, as well as a Banach space, in the norm 

\\f\\p=(^J^d''x\fix)\'’y'’. (B.87) 

Likewise, L°°(X) is a Banach space in the norm 

ll/lir = inf{f G [0,-] :p{{xG X, |/(x)| > f}) = 0}. (B.88) 
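
The difference between the ordinary supremum in (B.86) and the essential supremum in (B.88) can be seen on a finite toy example. The following is a minimal sketch, assuming a finite set $X$ whose measure is given by nonnegative point weights; the names `sup_norm`, `ess_sup_norm`, `mu` and `f` are illustrative and not part of the text. On such a space the infimum in (B.88) is simply the maximum of $|f|$ over the points of positive weight.

```python
def sup_norm(f, points):
    """Ordinary supremum of |f| over all points, as in (B.86)."""
    return max(abs(f[x]) for x in points)


def ess_sup_norm(f, mu):
    """Essential supremum as in (B.88) for a finite set with point weights mu.

    Points of weight zero form a null set and are ignored.
    """
    support = [x for x, weight in mu.items() if weight > 0]
    return max((abs(f[x]) for x in support), default=0.0)


# Hypothetical three-point measure space: the point "c" has measure zero.
mu = {"a": 1.0, "b": 2.0, "c": 0.0}
f = {"a": 1.0, "b": -3.0, "c": 100.0}

print(sup_norm(f, mu))      # depends on the value at the null point "c"
print(ess_sup_norm(f, mu))  # insensitive to the value at "c"
```

Changing the value of `f` at the null point `"c"` changes `sup_norm` but not `ess_sup_norm`, which is the sense in which only the latter is well defined on equivalence classes.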

Strictly speaking, elements of $L^p$ are therefore equivalence classes of functions
rather than functions, the pertinent equivalence relation being

$$f \sim_\mu g \ \text{ iff } \ \mu(\{x \in X \mid f(x) \neq g(x)\}) = 0, \tag{B.89}$$

but whenever no confusion can arise, we write $f \in L^p$ instead of $f \in \mathcal{L}^p$ or $[f] \in L^p$,
as we have already done, for example, in (B.87) and (B.88); that is, the left-hand
sides of these equations should officially be written as $\|[f]\|_p$ for $1 \le p \le \infty$. Note
in this respect that in (B.87) - (B.88) the function $f$ on the right-hand side could
be any representative of its equivalence class $[f]$. However, one cannot replace the
right-hand side of (B.88) by $\|f\|_\infty$, because (B.86) does depend on the representative
$f$. Those who dislike (B.88) may, equivalently, write

$$\|f\|_\infty^{\mathrm{ess}} = \inf\{\|g\|_\infty,\ g \sim_\mu f\}. \tag{B.90}$$

One should be aware of the need to pass to the quotient (B.85) in the first place:
the natural expressions (B.87) and (B.88) fail to define norms on $\mathcal{L}^p$ and $\mathcal{L}^\infty$,
respectively, because the positive definiteness axiom in Definition A.1.5c might fail.
Indeed, although any $f$ that is nonzero just on some null set is nonzero as an element
of the vector space $\mathcal{L}^p$, one has $\|f\|_p = 0$. This problem is solved by passing to $L^p$.
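
As a small numerical illustration of why the quotient is needed, the sketch below evaluates the expression (B.87) on a hypothetical finite measure space given by point weights (the names `p_norm`, `mu`, `f`, `g` are assumptions made for this example only). A function supported on a null point is nonzero as a function, yet has vanishing $p$-norm, and adding it to another function does not change that function's norm.

```python
def p_norm(f, mu, p):
    """Evaluate (B.87) on a finite set: (sum_x mu(x) |f(x)|^p)^(1/p)."""
    return sum(weight * abs(f[x]) ** p for x, weight in mu.items()) ** (1.0 / p)


# Hypothetical measure: the point "c" carries no weight (a null set).
mu = {"a": 1.0, "b": 2.0, "c": 0.0}

f = {"a": 3.0, "b": 2.0, "c": 7.0}
g = {"a": 0.0, "b": 0.0, "c": 5.0}       # nonzero function, zero mu-a.e.
f_plus_g = {x: f[x] + g[x] for x in mu}  # another representative of [f]

print(p_norm(g, mu, 2))         # 0.0 although g is not the zero function
print(p_norm(f, mu, 2))         # sqrt(1*9 + 2*4) = sqrt(17)
print(p_norm(f_plus_g, mu, 2))  # same value as for f
```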

The proof of Theorem B.29 uses both parts of Theorem B.13, which is concerned
with a sequence $(f_n)$ of functions in $\mathcal{L}^1(X)$, where $(X,\Sigma,\mu)$ is an arbitrary measure
space. Note that on our definition of $\mathcal{L}^p$ spaces, these pointwise limits themselves
might not lie in $\mathcal{L}^1$, but it is part of the conclusion of the convergence theorems that
they do so up to some null set, and hence do define elements of $L^1$. For this reason, at
this point one must distinguish between $f \in \mathcal{L}^1$ and $[f] \in L^1$. Let us mention in this
context that $L^p$ spaces are often constructed from measurable functions $f : X \to \mathbb{C}$
whose positive real parts $f_i$ (cf. (B.31)) by definition take values in $[0,\infty]$. This also
leads to slightly more general versions of the Lebesgue convergence theorems, in
which the $f_n$ are allowed to be infinite on null sets. However, if $f \in L^p$, then $|f| < \infty$
$\mu$-a.e., so little is lost by starting from functions $f : X \to \mathbb{C}$ or $f : X \to \mathbb{R}$.

Proof. We first prove Theorem B.29 for $1 \le p < \infty$. Minkowski’s Inequality (B.14)
holds for $L^p \equiv L^p(X)$ just as it does for $\ell^p$, as does Hölder’s Inequality (B.15), so
it remains to prove completeness. To this effect, let $(f_n)$ be a Cauchy sequence in $L^p$.
Then $(f_n)$ has a subsequence $(f_{n_k})_k$ such that

$$\|f_{n_{k+1}} - f_{n_k}\|_p \le 2^{-k} \tag{B.91}$$

for each $k \in \mathbb{N}$ (indeed, for given $\varepsilon = 2^{-k}$, take $n_k$ to be the famously existing $N$ for
which $\|f_n - f_m\|_p < \varepsilon$ for all $n,m \ge N$, etc.), and if $\lim_{k\to\infty} \|f_{n_k} - f\|_p = 0$ for some
$f$, then $\lim_{n\to\infty} \|f_n - f\|_p = 0$ (this is a standard feature of Cauchy sequences).
We now rewrite $f_{n_k}$ using a little trick, and introduce an auxiliary function $g_{n_k}$ by

$$f_{n_k} = f_{n_1} + \sum_{l=1}^{k-1} (f_{n_{l+1}} - f_{n_l}); \tag{B.92}$$

$$g_{n_k} = |f_{n_1}| + \sum_{l=1}^{k-1} |f_{n_{l+1}} - f_{n_l}|. \tag{B.93}$$

Using (B.91), we estimate $\|g_{n_k}\|_p \le \|f_{n_1}\|_p + \sum_{l=1}^{k-1} 2^{-l}$, which converges as $k \to \infty$.
Hence $\sup_k \|g_{n_k}^p\|_1 < \infty$, so by the Monotone Convergence Theorem, $\lim_{k\to\infty} g_{n_k}^p = h$
exists pointwise $\mu$-a.e., with $h \in L^1$. Since $g_{n_k} \ge 0$, we have $h \ge 0$ at least $\mu$-a.e.,
and with $g = h^{1/p}$, by continuity of $x \mapsto x^{1/p}$, we have $g_{n_k} \to g$ pointwise $\mu$-a.e.,
with $g \in L^p$. Thus the series (B.92) converges (absolutely pointwise $\mu$-a.e.) to some
$f$. Since $|f| \le g$, we also have $f \in L^p$. To prove that $f_{n_k} \to f$ in $L^p$ (and not just
pointwise $\mu$-a.e.), we estimate

$$|f(x) - f_{n_k}(x)|^p \le (2\max\{|f(x)|, |f_{n_k}(x)|\})^p$$

$$\le 2^p (|f(x)|^p + |f_{n_k}(x)|^p) \le 2^{p+1} g(x)^p,$$

so, already knowing that $g^p \in L^1$, we may use (B.36) in the Dominated Convergence
Theorem (with $f_n$ replaced by $|f - f_{n_k}|^p$, and hence $f$ replaced by the zero function)
to conclude that $\lim_{k\to\infty} \int_X d\mu\, |f(x) - f_{n_k}(x)|^p = 0$, i.e., $\|f - f_{n_k}\|_p \to 0$.

We continue for $p = \infty$. For any fixed measurable subset $E \subseteq X$ we define

$$\|f\|_\infty^{(E)} = \sup\{|f(x)| \mid x \in E\} = \inf\{t \in [0,\infty] \mid |f(x)| \le t \ \forall x \in E\}. \tag{B.94}$$

If $X \setminus E$ has measure zero, as we assume in what follows, then

$$\|f\|_\infty^{\mathrm{ess}} \le \|f\|_\infty^{(E)}, \tag{B.95}$$

since the null set $X \setminus E$ might be expanded to a larger set of measure zero, which might
decrease the infimum in (B.88). It follows that convergence with respect to the norm $\|\cdot\|_\infty^{(E)}$
implies convergence in $\|\cdot\|_\infty^{\mathrm{ess}}$. We use this insight to prove the completeness of $L^\infty$
by reducing this to a limiting problem with respect to the norm $\|\cdot\|_\infty^{(E)}$, for a suitable
choice of $E \subseteq X$. Namely, let $(f_n)$ be a Cauchy sequence in $L^\infty$. This means

$$\forall \varepsilon > 0 \ \exists n \ \forall j,k > n : \ \|f_j - f_k\|_\infty^{\mathrm{ess}} < \varepsilon.$$

Parametrizing $\varepsilon = 1/m$ for large $m \in \mathbb{N}$, and using (B.88), this implies:

$$\forall m \ \exists n \ \forall j,k > n \ \exists N_{(j,k,m)} \subset X : \ \mu(N_{(j,k,m)}) = 0 \ \text{ and } \ \forall x \in X \setminus N_{(j,k,m)} : \ |f_j(x) - f_k(x)| < 1/m.$$

Now define $N = \bigcup_{j,k,m \in \mathbb{N}} N_{(j,k,m)}$. Since measures are countably additive by definition
and $N$ is a countable union of measure zero sets, $N$ has measure zero. With
$E = X \setminus N$, so that $X \setminus E = N$ has measure zero, as above, we then have

$$\forall m \ \exists n \ \forall j,k > n \ \forall x \in E : \ |f_j(x) - f_k(x)| < 1/m.$$

Thus $(f_n)$ (strictly speaking, the corresponding sequence of restrictions of each $f_n$ to
$E$) is a Cauchy sequence of bounded functions on $E$ in the supremum norm (B.94),
so that we are back in the $\ell^\infty(X)$ case with $X$ replaced by $E$, with the three-step proof we
gave: the pointwise limits $f(x) = \lim_{n\to\infty} f_n(x)$ exist, the function $f$ thus defined on
$E$ is bounded, i.e., $\|f\|_\infty^{(E)} < \infty$, and $f_n \to f$ not just pointwise but also in the norm
$\|\cdot\|_\infty^{(E)}$. Extending $f$ from $E$ to $X$ in an arbitrary way (the ensuing equivalence class
in $L^\infty$ does not depend on the behaviour of $f$ on the null set $X \setminus E$), we first conclude
from (B.95) that $\|f\|_\infty^{\mathrm{ess}} < \infty$, and secondly infer that $f_n \to f$ also in $\|\cdot\|_\infty^{\mathrm{ess}}$. □

Without proof, we state some useful results about the place of continuous functions
in $L^p$-spaces. For simplicity, we assume that $\mu$ is regular and has support $X$ (in
that $X$ has no nonempty open subset $U$ with $\mu(U) = 0$). In that case, $C_b(X)$ and its subspaces
$C_0(X)$ and $C_c(X)$ may be seen as subspaces of $L^\infty(X)$, on which the norm (B.88) or
(B.90) simply reduces to the ordinary sup-norm (1.24).

Theorem B.30. • If $1 \le p < \infty$, then $C_c(X)$ is dense in $L^p(X)$ (in the $L^p$-norm).

• If $p = \infty$, one has an inclusion of Banach spaces (all carrying the $L^\infty$-norm)

$$C_0(X) \subseteq C_b(X) \subseteq L^\infty(X). \tag{B.96}$$

Compare (B.22). Since the closure of $C_c(X)$ is $C_0(X)$, it follows that $C_c(X)$ is dense in
$L^\infty(X)$ only in the exceptional case where $L^\infty(X) = C_0(X)$ (e.g., for finite $X$). So in
this respect, the values $1 \le p < \infty$ behave quite differently from $p = \infty$.

The first claim is based on two facts, of which the first is true for all $1 \le p \le \infty$,
whereas the second is valid only for $1 \le p < \infty$ (i.e. it fails for $p = \infty$):

1. The set $S(X)$ of simple functions $s = \sum_i \lambda_i 1_{A_i}$, where $\mu(A_i) < \infty$ for each $i$, is
dense in $L^p(X)$;

2. For each measurable subset $A \subset X$ with $\mu(A) < \infty$ and each $\varepsilon > 0$ there is a
function $g \in C_c(X)$ such that $\|1_A - g\|_p < \varepsilon$.

Similarly to Theorems B.27 and B.24, we know the state space of $L^\infty(X,\nu)$:

Theorem B.31. Let $(X,\Sigma,\nu)$ be a measure space. There is a bijective correspondence
between states on $L^\infty(X,\nu)$ and finitely additive probability measures $\mu$ on
$(X,\Sigma)$ that are absolutely continuous with respect to $\nu$ (i.e., $\nu(A) = 0$ implies
$\mu(A) = 0$), given by (B.39) and (B.81).

In this case, the role of the semiring is of course played by $\Sigma$, so that $\mathrm{Step}(X,\Sigma)$
is simply the complex linear span of the simple functions on $(X,\Sigma)$, and (B.28) duly
applies. Since it may once again be shown that $\mathrm{Step}(X,\Sigma)$ is dense in $L^\infty(X,\nu)$, the
definition (B.81) of integration “by continuity” makes sense in this situation, too.

B.7 Morphisms and isomorphisms of Banach spaces

We often want to say that two Banach spaces are isomorphic. For example, in the
next section the dual of a given Banach space is typically identified with some
known Banach space; such identifications even belong to the nicest results in functional
analysis. Of course, this issue is predicated on the correct definition of (not
necessarily invertible) maps between Banach spaces in the first place.

Definition B.32. A morphism $a : V \to W$ between Banach spaces $V, W$ (or, more
generally, normed spaces) is a bounded linear map, i.e., a linear map for which
there is a constant $C > 0$ such that for each $v \in V$,

$$\|av\|_W \le C \|v\|_V, \tag{B.97}$$

or, equivalently,

$$\sup\{\|av\|_W,\ v \in V,\ \|v\|_V \le 1\} < \infty. \tag{B.98}$$

It is extremely important (yet easy to show) that bounded maps are automatically
continuous (and even uniformly continuous); conversely, a continuous linear map
between vector spaces with norm is bounded. We note two important special cases:

• If $W = V$, a morphism $a : V \to V$ is called a (bounded) operator on $V$.

• If $W = \mathbb{C}$, a morphism $\varphi : V \to \mathbb{C}$ is called a (bounded linear) functional on $V$.

Theorem B.33. Let $V$ be a normed vector space and $W$ a Banach space. The space
$B(V,W)$ of all morphisms (i.e., bounded linear maps) $a : V \to W$ is a Banach space
with respect to pointwise operations (e.g., $(\lambda a + b)v = \lambda av + bv$), and the norm

$$\|a\| = \sup\{\|av\|_W,\ v \in V,\ \|v\|_V \le 1\}. \tag{B.99}$$

Proof. Only completeness is nontrivial; the idea is that if $(a_n)$ is a Cauchy sequence
in $B(V,W)$, we define $a : V \to W$ by $av = \lim_n a_n v$. This limit exists, since we have
$\|a_n v - a_m v\|_W \le \|a_n - a_m\| \|v\|_V$ and $W$ is complete. Furthermore, it is easy to show
(e.g., by contradiction) that a Cauchy sequence must be bounded, say $\|a_n\| \le K$, and that, if
$a_n v \to w$, then also $\|a_n v\|_W \to \|w\|_W$. Hence $\|av\|_W = \lim_n \|a_n v\|_W \le K \|v\|_V$, so
$a \in B(V,W)$. Finally, $a_n \to a$: given $\varepsilon > 0$, take the usual $N$ for which
$\|a_n - a_m\| < \varepsilon/2$ for all $n,m > N$; for $\|v\|_V \le 1$, pick $m > N$ (depending on $v$)
so large that $\|av - a_m v\|_W < \varepsilon/2$. Then for all $n > N$,

$$\|av - a_n v\|_W \le \|av - a_m v\|_W + \|a_m v - a_n v\|_W \le \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon. \tag{B.100}$$

Since this holds for any $v \in V$ with $\|v\|_V \le 1$, eq. (B.99) gives $\|a - a_n\| \le \varepsilon$. □

Clearly, if $a \in B(V,W)$, then one has the useful estimate, cf. (A.20),

$$\|av\|_W \le \|a\| \|v\|_V, \tag{B.101}$$

and if $W = V$ and $a,b \in B(V) \equiv B(V,V)$, we also have (cf. (A.21))

$$\|ab\| \le \|a\| \|b\|. \tag{B.102}$$

Indeed, $B(V)$ is a Banach algebra, which is just to say that it is a Banach space as
well as an algebra, in which (B.102) holds (a C*-algebra will be a special case).
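
The estimates (B.101) and (B.102) are easy to check numerically in a finite-dimensional toy case. The sketch below assumes $V = W = \mathbb{R}^n$ with the sup-norm, for which the operator norm (B.99) of a matrix is known to be its maximal absolute row sum; the helper names (`vec_sup_norm`, `op_norm`, `apply`, `compose`) and the sample matrices are illustrative choices, not taken from the text.

```python
def vec_sup_norm(v):
    """Sup-norm of a vector in R^n."""
    return max(abs(x) for x in v)


def apply(a, v):
    """Matrix-vector product a v, with a given as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def compose(a, b):
    """Matrix product ab, so that (ab)v = a(bv)."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def op_norm(a):
    """Operator norm (B.99) with respect to the sup-norm: maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


a = [[1.0, -2.0], [0.0, 3.0]]
b = [[2.0, 1.0], [-1.0, 1.0]]
v = [1.0, -1.0]

# (B.101): ||av|| <= ||a|| ||v||   and   (B.102): ||ab|| <= ||a|| ||b||
print(vec_sup_norm(apply(a, v)), op_norm(a) * vec_sup_norm(v))
print(op_norm(compose(a, b)), op_norm(a) * op_norm(b))
```

For this particular `v` the bound (B.101) is attained, since `v` carries the signs of a row of `a` with maximal absolute row sum; the bound (B.102) is strict for these matrices.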

Returning to our opening theme, the level of discourse now suddenly becomes 
quite advanced. We start with Banach’s famous Open Mapping Theorem. 

Theorem B.34. If $V$ and $W$ are Banach spaces and $a \in B(V,W)$ is surjective, then
$a$ is open (in mapping open sets to open sets).

Proof. For fixed $u \in V$ we write $V_r(u) = \{v \in V : \|u - v\| < r\}$ for the open $r$-ball
around $u$, with $V_r \equiv V_r(0)$ and hence $V_r(u) = u + V_r$. Furthermore, the closure of
$U \subset V$ is denoted by $U^-$. Likewise for $W$. The theorem follows if $aV_1 \equiv a(V_1) \subset W$
contains an open ball $W_s$, for some $s > 0$ (in which case, by linearity, $aV_r$ contains
an open ball $W_{rs}$ for any $r > 0$). Indeed, by the theory of metric spaces, some subset $U \subset V$
is open iff for any $u \in U$ there is $r > 0$ such that $V_r(u) \subset U$. Then $aU$ contains the
open set $W_{rs}(au)$, and since $au \in aU$ is arbitrary, $aU$ is open by the same criterion.

To prove that $aV_1$ contains an open ball, first note that since $a : V \to W$ is surjective,
$W = \bigcup_{n \in \mathbb{N}} aV_n$, so that by the Baire Category Theorem (which applies because
Banach spaces are complete metric spaces by definition) some $(aV_n)^-$ contains an
open set, and hence an open ball. Since $a$ is linear this must then be true for all $n$; let
us take $n = 1$, so that $W_\varepsilon(w_0) \subset (aV_1)^-$ for some $w_0 \in (aV_1)^-$. Since any point in
the closure of some $U \subset W$ can be approximated by points in $U$, there is $w_1 \in aV_1$
such that $\|w_1 - w_0\| < \tfrac12\varepsilon$. Hence for any $w \in W_{\varepsilon/2}$ we have

$$\|(w_1 - w) - w_0\| \le \|w_1 - w_0\| + \|w\| < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon, \tag{B.103}$$

so $w_1 - w \in W_\varepsilon(w_0)$ and hence $w_1 - w \in (aV_1)^-$. Similarly, $w_1 + w \in (aV_1)^-$. Since
$w = \tfrac12(w_1 + w) - \tfrac12(w_1 - w)$, we obtain $w \in (aV_1)^-$, for if $x, y \in (aV_1)^-$, then we have
$\tfrac12(x \pm y) \in (aV_1)^-$. Since $w \in W_{\varepsilon/2}$ was arbitrary, it follows that $W_{\varepsilon/2} \subset (aV_1)^-$.

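To produce an open ball in $aV_1$ rather than in its closure, let $w'_0 \in W_{\varepsilon/4}$, so that
$2w'_0 \in W_{\varepsilon/2}$. Hence there exists $w'_1 \in aV_1$ such that $\|2w'_0 - w'_1\| < \varepsilon/4$. And because
$2(2w'_0 - w'_1) \in W_{\varepsilon/2}$, there exists $w'_2 \in aV_1$ such that $\|2(2w'_0 - w'_1) - w'_2\| < \varepsilon/4$, et
cetera. Because $2(2(2w'_0 - w'_1) - w'_2) \in W_{\varepsilon/2}$, there exists $w'_3 \in aV_1$, ...

Repeating this $N$ times, we obtain a sequence $(w'_n)$ in $aV_1$ such that for any $N \in \mathbb{N}$,

$$\|2^N w'_0 - 2^{N-1} w'_1 - \cdots - 2 w'_{N-1} - w'_N\| < \varepsilon/4, \tag{B.104}$$

i.e., $\|w'_0 - \sum_{n=1}^N 2^{-n} w'_n\| < 2^{-N-2}\varepsilon$. Letting $N \to \infty$ then gives $w'_0 = \sum_{n=1}^\infty 2^{-n} w'_n$.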

Since $w'_n \in aV_1$, there is a corresponding sequence $(v'_n)$ in $V_1$ such that $av'_n = w'_n$,
with $\|v'_n\| < 1$ for each $n$. Hence we may estimate $\sum_{n=1}^\infty \|2^{-n} v'_n\| < \sum_{n=1}^\infty 2^{-n} = 1$,
so the series $\sum_n 2^{-n} v'_n$ in $V$ is absolutely convergent and hence convergent: since $V$
is assumed complete, it has a limit $v' = \sum_{n=1}^\infty 2^{-n} v'_n$. Since

$$\Big\| \sum_{n=1}^N 2^{-n} v'_n \Big\| \le \sum_{n=1}^N \|2^{-n} v'_n\| \le \sum_{n=1}^\infty \|2^{-n} v'_n\| < 1,$$

letting $N \to \infty$ gives $\|v'\| < 1$, or $v' \in V_1$. Since $a$ is bounded and hence continuous,

$$av' = a\Big( \sum_{n=1}^\infty 2^{-n} v'_n \Big) = \sum_{n=1}^\infty 2^{-n} av'_n = \sum_{n=1}^\infty 2^{-n} w'_n = w'_0. \tag{B.105}$$

We now recall that $w'_0 \in W_{\varepsilon/4}$ was arbitrary, so we have shown that $W_{\varepsilon/4} \subset a(V_1)$.
By linearity of $a$, it follows that $W_s \subset aV_1$ for any $s \le \varepsilon/4$. □

Corollary B.35. Let $V$ and $W$ be Banach spaces. The (set-theoretic) inverse $a^{-1}$ of
a bijective morphism $a \in B(V,W)$ is automatically linear and bounded.

In other words, $a^{-1}$ lies in $B(W,V)$. Corollary B.35 suggests defining two Banach
spaces $V$ and $W$ to be isomorphic if there exists a bijective morphism $a \in B(V,W)$
(in which case they would be isomorphic as objects in the category of Banach spaces
with bounded linear maps). However, we often prefer to use a sharper notion.

Definition B.36. Let $V$ and $W$ be normed spaces.

1. An isometry from $V$ to $W$ is a linear map $u : V \to W$ satisfying

$$\|uv\|_W = \|v\|_V, \quad v \in V. \tag{B.106}$$

2. An isometric isomorphism from $V$ to $W$ is a surjective isometry $u : V \to W$.

Since an isometry is clearly bounded as well as injective, by Corollary B.35 a surjective
isometry has a bounded linear inverse, which is easily seen to be isometric,
too. In practice, it is the conditions in Definition B.36 that one typically checks.

Nonetheless, the non-isometric case is also quite important. As a case in point, we
prove a classical result of functional analysis, called the Closed Graph Theorem. In
preparation, note that two normed spaces $V$, $W$ define a third one, called their direct
sum $V \oplus W$, which as a set is $V \times W$, turned into a vector space by the operations
$(v_1,w_1) + (v_2,w_2) = (v_1 + v_2, w_1 + w_2)$ and $\lambda(v,w) = (\lambda v, \lambda w)$, etc., with norm

$$\|(v,w)\| = \|v\|_V + \|w\|_W. \tag{B.107}$$

It is easily shown that if $V$ and $W$ are Banach spaces, then so is $V \oplus W$.

Furthermore, if $a : V \to W$ is any linear map, the graph of $a$ is the vector space

$$G(a) = \{(v,av),\ v \in V\} \subset V \oplus W. \tag{B.108}$$

If $a$ is bounded, then $G(a)$ is closed (i.e. in the norm inherited from the Banach
space $V \oplus W$). The converse, then, is the Closed Graph Theorem:

Theorem B.37. Let $V$ and $W$ be Banach spaces and let $a : V \to W$ be a linear map.
If the graph $G(a)$ is closed (in the norm inherited from $V \oplus W$), then $a$ is bounded.

Proof. Let $b : G(a) \to V$ be the linear map $(v,av) \mapsto v$, which is clearly a bijection,
with inverse $b^{-1} : V \to G(a)$, $b^{-1}(v) = (v,av)$. Furthermore, $\|b(v,av)\| =
\|v\|_V \le \|v\|_V + \|av\|_W = \|(v,av)\|$, so $b$ is bounded. Being a closed subspace of the
Banach space $V \oplus W$, the graph $G(a)$ is itself a Banach space, so that Corollary B.35 applies
and makes $b^{-1}$ bounded as well, i.e., $\|b^{-1}(v)\| \le C\|v\|_V$ for some $C > 0$. Hence $\|(v,av)\| =
\|v\|_V + \|av\|_W \le C\|v\|_V$. So $\|av\|_W \le (C - 1)\|v\|_V$, hence $a$ is bounded. □

B.8 The Hahn-Banach Theorem

In this section we present another traditional pillar of functional analysis. 

Definition B.38. A sublinear functional on a real vector space $V$ is a map $p : V \to \mathbb{R}$
that for each $v, w \in V$ and scalars $t \ge 0$ satisfies

$$p(v + w) \le p(v) + p(w); \tag{B.109}$$

$$p(tv) = t\,p(v). \tag{B.110}$$

We will deal with two examples of such functionals. One is simply a norm (even on a
complex vector space, which in particular is a real vector space). For the other, recall
that a subset $K$ of a real vector space $V$ is called convex if whenever $v, w \in K$ and
$t \in (0,1)$, one has $tv + (1 - t)w \in K$. Even without a topology on $V$, we can define
an interior point of $K$ (or indeed of any subset of $V$) as a point $v \in K$ such that for
each $v' \in V$ there is $\varepsilon > 0$ such that $v + tv' \in K$ for any $0 \le t < \varepsilon$. We denote the
set of interior points of $K$ by $\mathrm{int}(K)$. For example, if $V$ is normed (with associated
topology), or is the dual of a normed space equipped with the w*-topology (or,
even more generally, if $V$ is a topological vector space, i.e., a vector space carrying
a Hausdorff topology in which addition and scalar multiplication are continuous),
then each point of an open set $U$ is interior in the above sense, so that $U = \mathrm{int}(U)$.

Let $K \subset V$ be convex and suppose it contains 0 as an interior point. Then the
Minkowski functional (also called gauge) $p : V \to \mathbb{R}^+$
of $K$ is defined by

$$p(v) = \inf\{a > 0 \mid v/a \in K\}. \tag{B.111}$$

Note that $p(v) < \infty$, because $0 \in K$ is interior, so that there is $\varepsilon > 0$ such that $\varepsilon v \in K$,
and hence $a = 1/\varepsilon$ lies in the set in (B.111). It is clear that if $v \in K$, then $a = 1$ lies
in the set in (B.111), so that $p(v) \le 1$. As a simple example, for the (open or closed)
unit ball $B$ in a normed space (both of which are convex), we have $p(v) = \|v\|$.
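
The definition (B.111) can be turned into a small numerical procedure. The following is a rough sketch, assuming that $K \subset \mathbb{R}^n$ is convex with 0 as an interior point and is given by a membership test; the function `gauge` and the two sample sets are hypothetical names chosen for this illustration. Because $K$ is convex and contains 0, the set of admissible $a$ in (B.111) is an interval unbounded to the right, so its infimum can be located by bisection.

```python
def gauge(in_K, v, tol=1e-12):
    """Approximate the Minkowski functional p(v) = inf{a > 0 | v/a in K}.

    in_K is a membership test for a convex set K having 0 as an interior point.
    """
    def scaled(a):
        return [x / a for x in v]

    hi = 1.0
    while not in_K(scaled(hi)):   # terminates because 0 is an interior point
        hi *= 2.0
    lo = 0.0                      # invariant: v/hi lies in K, v/lo does not (or lo = 0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_K(scaled(mid)):
            hi = mid
        else:
            lo = mid
    return hi


# Closed unit ball of the l^1-norm on R^2: the gauge should reproduce the norm.
def in_l1_ball(x):
    return sum(abs(t) for t in x) <= 1.0


# Open ellipse x^2/4 + y^2 < 1: here p(x, y) = sqrt(x^2/4 + y^2).
def in_ellipse(x):
    return x[0] ** 2 / 4.0 + x[1] ** 2 < 1.0


print(gauge(in_l1_ball, [3.0, -4.0]))   # approximately 7 = |3| + |-4|
print(gauge(in_ellipse, [2.0, 1.0]))    # approximately sqrt(2)
```

The ellipse example also shows sublinearity at work: the gauges of $(2,0)$ and $(0,1)$ are both 1, whereas the gauge of their sum is about 1.414, in line with (B.109).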

Proposition B.39. Let $K \subset V$ be convex and let $0 \in K$ be an interior point of $K$.
Then the Minkowski functional $p$ of $K$ satisfies (B.109) - (B.110). Furthermore, we
may recover the set $\mathrm{int}(K)$ of interior points of $K$ through

$$\mathrm{int}(K) = \{v \in V \mid p(v) < 1\}. \tag{B.112}$$

Conversely, if some function $p : V \to \mathbb{R}^+$ satisfies (B.109) - (B.110), then the set

$$K = \{v \in V \mid p(v) \le 1\} \tag{B.113}$$

is convex, with interior given by (B.112).

For example, if $K$ is open (in a topological vector space), then (B.112) equals $K$.

Proof. Given (B.111), eq. (B.110) is obvious. To prove (B.109), find $a > 0$ and $b > 0$
such that $v/a \in K$ and $w/b \in K$; cf. the comment after (B.111). Since $K$ is convex,
with $t = a/(a+b)$ and hence $1 - t = b/(a+b)$ we have $t \cdot v/a + (1 - t) \cdot w/b \in K$.
Hence $p(t \cdot v/a + (1 - t) \cdot w/b) \le 1$, which, using (B.110), reads $p(v + w) \le a + b$.
Taking the infimum over $a$ and $b$ constrained by $v/a \in K$, $w/b \in K$ then turns the
right-hand side into $p(v) + p(w)$, so that we have proved (B.109).

The proof of the converse claims is almost trivial, except perhaps for the last
claim. To prove that $p(v) < 1$ implies $v \in \mathrm{int}(K)$, we note that for any $v' \in V$ and
$\varepsilon > 0$, from (B.109) - (B.110) we have $p(v + \varepsilon v') \le p(v) + \varepsilon p(v')$. If $p(v') = 0$, this
gives $p(v + \varepsilon v') \le p(v) < 1$, so that $v + \varepsilon v' \in K$. If not, assume $p(v) = 1 - \delta$ for
some $\delta \in (0,1]$, and we find that $p(v + \varepsilon v') < 1$ for any $0 \le \varepsilon < \delta/p(v')$. □

Having motivated Definition B.38, we now state the Hahn-Banach Theorem:

Theorem B.40. Let $V$ be a real vector space equipped with a sublinear functional
$p$, and let $W \subset V$ be a linear subspace carrying a linear map $\varphi_W : W \to \mathbb{R}$ that is
dominated by $p$ in the sense that for each $w \in W$ we have $\varphi_W(w) \le p(w)$.

Then $\varphi_W$ has a linear extension $\varphi : V \to \mathbb{R}$ that for each $v \in V$ satisfies

$$\varphi(v) \le p(v). \tag{B.114}$$

Proof. Take $v_1 \in V$, $v_1 \notin W$, and extend $\varphi_W$ to $W \oplus \mathbb{R} \cdot v_1$ by

$$\varphi(w + tv_1) = \varphi_W(w) + t\varphi(v_1), \tag{B.115}$$

with $t \in \mathbb{R}$ and $\varphi(v_1)$ to be determined. In order to satisfy (B.114), we need

$$\varphi(w + tv_1) \le p(w + tv_1), \tag{B.116}$$

for each $w \in W$ and $t \in \mathbb{R}$. Using (B.110), this is true iff it is true for $t = \pm 1$, which
yields two conditions (in two variables $w, w' \in W$), which may jointly be written as

$$\varphi(w') - p(w' - v_1) \le \varphi(v_1) \le p(w + v_1) - \varphi(w). \tag{B.117}$$

Since $\varphi$ is linear, this can obviously be satisfied by some $\varphi(v_1) \in \mathbb{R}$ iff

$$\varphi(w + w') \le p(w + v_1) + p(w' - v_1), \tag{B.118}$$

which is indeed the case: for by assumption we have $\varphi(w + w') \le p(w + w')$, whence

$$\varphi(w + w') \le p(w + v_1 + w' - v_1) \le p(w + v_1) + p(w' - v_1), \tag{B.119}$$

where we used (B.109). Hence any choice of $\varphi(v_1)$ that satisfies (B.117) provides
an extension (B.115) of $\varphi_W$ to $W \oplus \mathbb{R} \cdot v_1$, which by construction satisfies (B.114).

Lovers of Zorn’s Lemma may now complete the proof as follows. Let $\mathcal{F}$ be the
set of all pairs $(\varphi, X)$, where $X \subseteq V$ is a linear subspace containing $W$ and
$\varphi : X \to \mathbb{R}$ is a linear extension of $\varphi_W$ that satisfies (B.114) on $X$. We partially
order $\mathcal{F}$ by

$$(\varphi_1, X_1) \le (\varphi_2, X_2) \ \text{ iff } \ X_1 \subseteq X_2 \ \text{ and } \ \varphi_1(v) = \varphi_2(v) \ \forall v \in X_1. \tag{B.120}$$

Then $\mathcal{F}$ is clearly nonempty, and every totally ordered subset $\{(\varphi_\lambda, X_\lambda)\}$ of $\mathcal{F}$ has
an upper bound $(\varphi, X)$, where $X = \bigcup_\lambda X_\lambda$ and $\varphi(v) = \varphi_\lambda(v)$ whenever $v \in X_\lambda$.
Thus Zorn’s Lemma applies, “giving” a maximal element $(\varphi, Z)$. If $Z \neq V$, one may
extend $Z$ by the first step of the proof (applied with $W$ replaced by $Z$), contradicting maximality
of $(\varphi, Z)$. Hence $Z = V$, and $\varphi$ is the desired functional. □

If $V$ is finite-dimensional, then Zorn’s Lemma is unnecessary, and a constructive
proof may be given by repeating the first step of the proof a finite number of times.
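
The one-step extension used in the proof can be made concrete in a two-dimensional toy case. The sketch below is only an illustration under stated assumptions: $V = \mathbb{R}^2$, the sublinear functional $p$ is the $\ell^1$-norm, $W = \mathbb{R} \cdot e_1$ with $\varphi_W(s e_1) = s$, and the new vector is $v_1 = e_2$; all function names are hypothetical. The code approximates, on a finite grid of $w = s e_1$, the interval of admissible values $c = \varphi(v_1)$ allowed by the two-sided condition (B.117), and then checks the domination (B.114) for one admissible and one inadmissible choice.

```python
def p(v):
    """Sublinear functional on R^2 (here the l^1-norm, an assumption of this sketch)."""
    return abs(v[0]) + abs(v[1])


def phi_W(s):
    """Given functional on W = R * e1, evaluated at w = s * e1."""
    return s


def extension_interval(samples):
    """Grid approximation of the bounds in (B.117) for c = phi(e2)."""
    lower = max(phi_W(s) - p((s, -1.0)) for s in samples)   # w' - v1 = (s, -1)
    upper = min(p((s, 1.0)) - phi_W(s) for s in samples)    # w + v1 = (s, +1)
    return lower, upper


def is_dominated(c, samples, slack=1e-12):
    """Check phi(x, y) = x + c*y <= p(x, y) on a grid, cf. (B.114)."""
    return all(x + c * y <= p((x, y)) + slack for x in samples for y in samples)


samples = [k / 4.0 for k in range(-20, 21)]   # grid on [-5, 5]
lower, upper = extension_interval(samples)
print(lower, upper)                 # approximately -1 and 1
print(is_dominated(0.5, samples))   # an admissible choice of phi(e2)
print(is_dominated(1.5, samples))   # violates (B.114), e.g. at (0, 1)
```

In this example every $c \in [-1,1]$ gives a valid extension, so the extension is far from unique; the theorem only asserts existence.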

Corollary B.41. Let $V$ be a normed vector space, with dual $V^*$, and let $W \subset V$ be a
linear subspace (inheriting the norm from $V$, with associated dual $W^*$).

Then each $\varphi_W \in W^*$ has an extension $\varphi \in V^*$ to $V$ with the same norm.

Proof. We take $p(v) = \|\varphi_W\| \|v\|$, which clearly satisfies (B.109) - (B.110). If $V$ is
real, Theorem B.40 gives $\varphi : V \to \mathbb{R}$ satisfying $|\varphi(v)| \le \|\varphi_W\| \|v\|$ for each $v \in V$
(apply (B.114) to both $v$ and $-v$), and hence $\|\varphi\| \le \|\varphi_W\|$. But $\|\varphi_W\| \le \|\varphi\|$, since
$W \subset V$ and $\varphi$ extends $\varphi_W$, hence $\|\varphi\| = \|\varphi_W\|$.

If $V$ is complex, we first regard it as a real vector space, take the real part $\varphi'_W$ of
$\varphi_W$, and isometrically extend $\varphi'_W$ to a real-linear functional $\varphi' : V \to \mathbb{R}$ as above, so that
$\|\varphi'\| = \|\varphi'_W\|$. Then define $\varphi : V \to \mathbb{C}$ by

$$\varphi(v) = \varphi'(v) - i\varphi'(iv). \tag{B.121}$$

One checks that $\varphi((s + it)v) = (s + it)\varphi(v)$. Since $\varphi'(v)$ is the real part of $\varphi(v)$,
with $|\varphi(v)|^2 = |\varphi'(v)|^2 + |\varphi'(iv)|^2$, we have $|\varphi'(v)| \le |\varphi(v)|$ and hence $\|\varphi'\| \le \|\varphi\|$.
Conversely, for any $v$ with $\varphi(v) \neq 0$, take $z = |\varphi(v)|/\varphi(v)$, so that $|\varphi(v)| = \varphi(zv)$.
Hence $\varphi(zv)$ is real and therefore it is equal to its real part, so that, since $|z| = 1$,

$$|\varphi(v)| = \varphi(zv) = \varphi'(zv) \le \|\varphi'\| \|zv\| = \|\varphi'\| \|v\|.$$

Therefore, $\|\varphi\| \le \|\varphi'\|$, and hence $\|\varphi\| = \|\varphi'\|$. The same computation applies to
$\varphi_W$, yielding $\|\varphi_W\| = \|\varphi'_W\|$, so that finally $\|\varphi\| = \|\varphi'\| = \|\varphi'_W\| = \|\varphi_W\|$. □

In fact, this trick to pass from the real to the complex case was overlooked by Hahn
and Banach themselves, whose arguments were much more involved.
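
The passage from the real part back to the complex functional, $\varphi(v) = \varphi'(v) - i\varphi'(iv)$, can be checked on a small example. The sketch below assumes $V = \mathbb{C}^2$ and an arbitrarily chosen complex-linear functional `phi`; the names `phi`, `phi_real` and `rebuild` are illustrative only. It forgets everything about `phi` except its real part and then reconstructs it.

```python
def phi(v):
    """A sample complex-linear functional on C^2 (coefficients chosen arbitrarily)."""
    return (2 + 1j) * v[0] + (-1 + 3j) * v[1]


def phi_real(v):
    """Real part of phi: a real-linear functional on C^2 seen as a real vector space."""
    return phi(v).real


def rebuild(real_part, v):
    """Reconstruct the complex functional via phi(v) = phi'(v) - i phi'(iv)."""
    iv = [1j * x for x in v]
    return real_part(v) - 1j * real_part(iv)


v = [1 - 2j, 0.5 + 4j]
z = 0.3 - 0.7j

print(phi(v), rebuild(phi_real, v))   # the two values agree
# Complex linearity of the reconstructed functional:
print(rebuild(phi_real, [z * x for x in v]), z * rebuild(phi_real, v))
```

The reconstruction works because the real part of $\varphi(iv) = i\varphi(v)$ equals minus the imaginary part of $\varphi(v)$.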

As to Zorn’s Lemma, if $V$ is infinite-dimensional but still separable, using (countable)
induction one may construct a sequence $(v_n)$ of linearly independent unit vectors
in $V \setminus W$, such that $V$ is the closed linear span of $W$ and the $v_n$. The above
procedure then gives $\varphi$ on the real algebraic linear span of $W$ and the $v_n$, which is
bounded by construction and may be extended to all of $V$ by continuity. However,
the construction of $(v_n)$ still requires a weaker form of the Axiom of Choice (which
is equivalent to Zorn’s Lemma), namely the so-called Axiom of Dependent Choice.

In the situation of Corollary B.41, the extension $\varphi$ is unique (for every choice of $W$ and
$\varphi_W$) iff the dual $V^*$ is strictly convex, where a normed space is called strictly convex
if its unit sphere is strictly convex, i.e., if $\|v\| = \|w\| = 1$ for $v \neq w$ and $t \in (0,1)$, then
$\|tv + (1 - t)w\| < 1$. Equivalently, if $\|v\| = \|w\| = \tfrac12\|v + w\|$, then $v = w$. This is the
case, for example, in Hilbert spaces
$H$, as easily follows from the comment after (A.3). Indeed, anticipating Theorem
B.66, if $W \subset H$ is closed (as we may assume, since $\varphi_W$ is continuous), we may
identify $\varphi_W : W \to \mathbb{C}$ with some vector $\varphi_W \in W$, and if we do, the unique extension
$\varphi : H \to \mathbb{C}$ corresponds to the same vector $\varphi_W$, now regarded as an element of $H$.

Corollary B.42. Let $V$ be a normed vector space, with dual $V^*$, and fix some
nonzero vector $v_0 \in V$. There exists a functional $\varphi \in V^*$ such that

$$\varphi(v_0) = \|v_0\|; \tag{B.122}$$

$$\|\varphi\| = 1. \tag{B.123}$$

Proof. Take $W = \mathbb{C} \cdot v_0$ and $\varphi_W(\lambda v_0) = \lambda \|v_0\|$ in Corollary B.41, so that
$\|\varphi_W\| = 1$ by construction. □

We now turn to an application of Theorem B.40 to convexity theory, which we
will need for the Krein-Milman Theorem (and hence eventually for the existence of
pure states on C*-algebras). Although we will apply the result below to the dual of
a normed vector space in its w*-topology, the setting is more general; all we need is
a few easily established facts for topological vector spaces $V$, namely that if $U \subset V$
is open, then so is every translate $U + v$ of $U$, and so is $\varepsilon U$, for any $\varepsilon > 0$, and
hence also $(-\varepsilon U) \cap (\varepsilon U)$. Furthermore, a linear map $\varphi : V \to \mathbb{R}$ is continuous iff it
is continuous at 0. These elementary facts will be used in the proof below.

Theorem B.43. Let $V$ be a real topological vector space and let $A$ and $B$ be disjoint
nonempty convex subsets of $V$, with $A$ open. Then there is a continuous linear
functional $\varphi : V \to \mathbb{R}$ and some $t \in \mathbb{R}$ such that $\varphi(a) < t \le \varphi(b)$ for all $a \in A$, $b \in B$.

Proof. Form $C = A - B = \{a - b \mid a \in A, b \in B\}$, which is convex and open (as it
is a union of open sets $A - b$ over $b \in B$). Then move $C$ so that it contains 0, by
taking any $a_0 \in A$ and $b_0 \in B$ and defining $K = C + v_0$, with $v_0 = b_0 - a_0$. Thus $K$
has its associated Minkowski functional $p_K$, cf. (B.111). Noting that $v_0 \notin K$ (since
$A \cap B = \emptyset$), we have $p_K(v_0) \ge 1$. With $W = \mathbb{R} \cdot v_0$, define a functional $\varphi_W : W \to \mathbb{R}$
by $\varphi_W(sv_0) = s$ for $s \in \mathbb{R}$. This implies $\varphi_W(v) \le p_K(v)$ for $v \in \mathbb{R} \cdot v_0$: if $v = sv_0$ with
$s \ge 0$, this is obvious from (B.110) and $\varphi_W(v_0) = 1$, and if $s < 0$, then $\varphi_W(v) < 0$
whereas $p_K(v) \ge 0$. We now use Theorem B.40 to extend $\varphi_W$ to a functional
$\varphi : V \to \mathbb{R}$ satisfying (B.114), which implies $\varphi(v) \le p_K(v) < 1$ for any $v \in K$ (recall
that $K$ is open, so that $K = \mathrm{int}(K)$ and (B.112) applies). Taking
$v = a - b + v_0$ gives $\varphi(a) < \varphi(b)$ for any $a \in A$, $b \in B$. Taking $t = \inf\{\varphi(b) \mid b \in B\}$,
the last claim of the theorem follows. Finally, since $\varphi(v) < 1$ for each $v \in K$, we have
$\varphi^{-1}(-\varepsilon,\varepsilon) \supset (-\varepsilon K) \cap (\varepsilon K)$, which is open, so that $\varphi$ is continuous. □

This is the precise result we will need, but variations abound. If $A$ and $B$ are
open, in which case $\varphi(B)$ is open, we have $\varphi(a) < t < \varphi(b)$. If $V$ is locally convex,
in that its topology has a basis consisting of convex sets, then if $A$ is closed and
$B$ is compact, there are disjoint open convex sets $A'$ and $B'$ containing $A$ and $B$,
respectively, so that also in this case we obtain the strict inequalities just mentioned.

Finally, even if $V$ has no topology, we can still show that $\varphi(a) \le t \le \varphi(b)$ on the
mere assumption that $A$ has an interior point ($\varphi$ then lacks continuity, of course).

Results like this are often called separation theorems. Namely, a plane $H$ in $\mathbb{R}^3$
always takes the form $x_0 + \ker\varphi = \{x_0 + x \mid \varphi(x) = 0\}$, where $x_0 \in \mathbb{R}^3$ and
$\varphi : \mathbb{R}^3 \to \mathbb{R}$ is a (nonzero) linear map. Equivalently, $H = \varphi^{-1}(c)$, where $c = \varphi(x_0)$.
More generally, a hyperplane in a vector space $V$ is a (nonempty) subspace of the form
$H = \varphi^{-1}(c)$, where $\varphi$ is a nonzero linear functional on $V$; clearly, $H$ has codimension one
and if $V$ is a topological vector space and $\varphi$ is continuous, then $H$ is closed. So Theorem B.43
shows that $A$ and $B$ are separated by the closed hyperplane $H = \varphi^{-1}(t)$.

B.9 Duality

We now turn to duality theory. For any normed (but not necessarily complete) vector
space $V$, Theorem B.33 shows that the space $V^*$ of all morphisms $\varphi : V \to \mathbb{C}$ is a
Banach space, called the dual of $V$. By (B.99), the norm of $\varphi \in V^*$ is given by

$$\|\varphi\| = \sup\{|\varphi(v)|,\ v \in V,\ \|v\|_V \le 1\}. \tag{B.124}$$

Any morphism $a \in B(V,W)$ induces a dual morphism $a^* \in B(W^*,V^*)$ by

$$(a^*\varphi)(v) = \varphi(av), \quad \varphi \in W^*. \tag{B.125}$$

By definition of the various norms involved here, we find

$$\|a^*\| = \sup\{|\varphi(av)|,\ \varphi \in W^*,\ v \in V,\ \|\varphi\| = \|v\| = 1\}. \tag{B.126}$$

Since $|\varphi(av)| \le \|\varphi\| \|av\| \le \|a\|$, this immediately yields

$$\|a^*\| \le \|a\|. \tag{B.127}$$

In fact, one even has

$$\|a^*\| = \|a\|, \tag{B.128}$$

but unexpectedly heavy machinery (namely the Hahn-Banach Theorem) is required
to prove this. By Corollary B.42 (applied to $W$), for any $v \in V$ with $av \neq 0$, there exists
$\varphi \in W^*$ with $\|\varphi\| = 1$ and $\varphi(av) = \|av\|$, so from (B.126) we have $\|a^*\| \ge \|av\|$ for any
$v \in V$ with $\|v\| = 1$. Taking the supremum over such $v$ and using (B.99) gives $\|a^*\| \ge \|a\|$.
With our earlier (B.127), this gives (B.128).
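
In finite dimensions the equality of the norms of a map and its dual can be verified by hand. The following sketch rests on assumptions that go beyond the text: it takes $V = \mathbb{R}^3$ and $W = \mathbb{R}^2$ with sup-norms, identifies their duals with the same spaces carrying the $\ell^1$-norm (a standard fact), and represents the dual map by the transposed matrix. The operator norm for sup-norms is the maximal absolute row sum, and for $\ell^1$-norms the maximal absolute column sum; the function names are illustrative.

```python
def transpose(a):
    """Matrix of the dual map: (a* phi)(v) = phi(av) becomes multiplication by a^T."""
    return [list(col) for col in zip(*a)]


def op_norm_sup(a):
    """Operator norm between sup-normed spaces: maximal absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def op_norm_l1(a):
    """Operator norm between l^1-normed spaces: maximal absolute column sum."""
    return max(sum(abs(x) for x in col) for col in zip(*a))


def apply(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def pair(phi, w):
    """Pairing phi(w) between a functional and a vector, both given by coordinates."""
    return sum(x * y for x, y in zip(phi, w))


a = [[1.0, -2.0, 3.0], [0.0, 4.0, -1.0]]     # a : R^3 -> R^2
phi = [1.0, -1.0]                            # functional on W = R^2
v = [1.0, 2.0, 3.0]

print(pair(apply(transpose(a), phi), v), pair(phi, apply(a, v)))  # both sides of the defining relation
print(op_norm_sup(a), op_norm_l1(transpose(a)))                   # the two operator norms coincide
```

Here the equality of norms is immediate, since the columns of the transposed matrix are the rows of `a`; in infinite dimensions no such shortcut is available, which is where the Hahn-Banach Theorem enters.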

Another application of Corollary B.42 lies in the double dual $V^{**} = (V^*)^*$.

Proposition B.44. For any normed space $V$, the map $v \mapsto \hat{v}$ from $V$ to $V^{**}$, given by

$$\hat{v}(\varphi) = \varphi(v), \quad \varphi \in V^*, \tag{B.129}$$

is isometric (and hence injective), mapping $V$ onto a closed subspace $\hat{V} \subset V^{**}$.

This will follow from part 1 of the following consequence of Corollary B.42:

Corollary B.45. Let $V$ be a normed vector space, with dual $V^*$.

1. For any $v \in V$, one has

$$\|v\| = \sup\{|\varphi(v)|,\ \varphi \in V^*,\ \|\varphi\| = 1\}. \tag{B.130}$$

2. For any $w \neq v$, there exists $\varphi \in V^*$ with $\varphi(w) \neq \varphi(v)$.

3. For any $a \in B(V,W)$, we have

$$\|a\| = \sup\{|\tau(av)|,\ v \in V,\ \tau \in W^*,\ \|v\| = \|\tau\| = 1\}. \tag{B.131}$$

Proof. This is the proof of Corollary B.45.

1. If $\|\varphi\| = 1$, then $|\varphi(v)| \le \|v\|$, so the supremum is $\le \|v\|$. But according to
Corollary B.42 the supremum is $\ge \|v\|$.

2. Take $v_0 = v - w$ in Corollary B.42 and use the previous item.

3. Apply part 1 in $W$ to $\|av\|$ and use (B.99). □

Proof. And this is the proof of Proposition B.44. Note that $\|\hat{v}\| \le \|v\|$, since

$$\|\hat{v}\| = \sup\{|\varphi(v)|,\ \varphi \in V^*,\ \|\varphi\| = 1\}, \tag{B.132}$$

and $|\varphi(v)| \le \|\varphi\| \|v\| = \|v\|$. Corollary B.42 shows this bound is saturated. □

If $V$ is finite-dimensional, Proposition B.44 gives a natural isomorphism $V^{**} \cong V$,
in contrast with the “unnatural” isomorphisms $V^* \cong V$ that require the choice of a
basis (this terminology is made precise in category theory, see Appendix E).

In addition to their (metric) topology coming from the norm, both $V$ and $V^*$ naturally
carry another topology (which will be of great importance in operator algebras
and hence in quantum theory), defined in an almost identical way:

• The weak topology on $V$ is the weakest topology that makes all functions $\varphi : V \to
\mathbb{C}$ continuous, $\varphi \in V^*$. Equivalently, one has convergence $v_n \to v$ (of sequences,
or, more generally, of nets) iff $\varphi(v_n) \to \varphi(v)$ for each $\varphi \in V^*$.

• The weak* topology (or w*-topology) on $V^*$ is the weakest topology that makes
all functions $\hat{v} : V^* \to \mathbb{C}$ continuous, $v \in V$. Equivalently, it is the topology of
pointwise convergence, in that $\varphi_n \to \varphi$ iff $\varphi_n(v) \to \varphi(v)$ for each $v \in V$ (etc.).
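
A standard example of weak convergence without norm convergence is the sequence of unit vectors $e_n$ in $\ell^2$: for every $\varphi \in \ell^2 \cong (\ell^2)^*$ one has $\varphi(e_n) = \varphi_n \to 0$, whereas $\|e_n\| = 1$ for all $n$. The sketch below can only hint at this, since it truncates $\ell^2$ to finitely many coordinates and tests a single functional; the truncation dimension `dim`, the choice $\varphi_k = 1/k$ and the helper names are assumptions of this illustration.

```python
import math


def unit_vector(n, dim):
    """The n-th standard unit vector e_n (1-based), truncated to dim coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]


def pair(phi, v):
    """Pairing phi(v) = sum_k phi_k v_k."""
    return sum(a * b for a, b in zip(phi, v))


def norm2(v):
    return math.sqrt(sum(x * x for x in v))


dim = 1000
phi = [1.0 / k for k in range(1, dim + 1)]   # square-summable, hence a functional on l^2

for n in (1, 10, 100, 1000):
    e_n = unit_vector(n, dim)
    # phi(e_n) = 1/n tends to 0, while the norm of e_n stays equal to 1.
    print(n, pair(phi, e_n), norm2(e_n))
```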

As their names suggest, these topologies are weaker than the norm topologies (except
when $V$ is finite-dimensional): indeed, if $\|v_n - v\| \to 0$ and $\varphi \in V^*$, then certainly
$|\varphi(v_n) - \varphi(v)| \le \|\varphi\| \|v_n - v\| \to 0$, and similarly for $V^*$. Consequently, a
functional $\varphi : V \to \mathbb{C}$ is norm-continuous if it is weakly continuous, but the converse
may be false. Nonetheless, the weak dual of $V$ coincides with its norm dual,
and we combine this with a contrasting result for the weak* continuous functionals on
$V^*$, which en passant locates the image $\hat{V}$ of $V$ in $V^{**}$ under (B.129):

Proposition B.46. • Any functional $\varphi \in V^*$ is weakly continuous.

• A functional $\varphi \in V^{**}$ is weak* continuous iff $\varphi \in \hat{V}$.

We just mention that, because of Corollary B.45.2, this proposition is a special case
of a very general result on topological vector spaces. Namely, let $V$ and $W$ be two
vector spaces in separating duality, that is, there is a bilinear form

\[ \langle\cdot,\cdot\rangle : V \times W \to \mathbb{C} \]

such that for each $v \neq v' \in V$ there is $w \in W$ with $\langle v,w\rangle \neq \langle v',w\rangle$, and for each
$w \neq w' \in W$ there is $v \in V$ with $\langle v,w\rangle \neq \langle v,w'\rangle$. Then $V$ can be given the so-called
$\sigma(V,W)$-topology, which is the weakest topology making each map $v \mapsto \langle v,w\rangle$
continuous ($w \in W$), and $W$ likewise carries the $\sigma(W,V)$-topology (sometimes also
called the $\sigma(V,W)$-topology). In particular, the weak topology on $V$ is just the
$\sigma(V,V^*)$-topology, whereas the weak* topology on $V^*$ is the $\sigma(V^*,V)$-topology.

Theorem B.47. Let $V$ and $W$ be vector spaces in separating duality. The space of
$\sigma(V,W)$-continuous linear functionals on $V$ coincides with $W$, and likewise, the
space of $\sigma(W,V)$-continuous linear functionals on $W$ coincides with $V$.

This follows from elementary topology, and hence we omit the proof. From this point of
view, the apparent difference between the two parts of Proposition B.46 originates
in the fact that the weak* topology on $V^*$ is defined by its separating duality with $V$
(or, equivalently, with $\hat{V}$), rather than its separating duality with $V^{**}$.

Next, the Banach-Alaoglu Theorem shows an unexpected but important property
of the weak* topology (at least when $V$ is infinite-dimensional). For example, in
quantum theory this theorem implies w*-compactness of the state space, and this, in
turn (through the Krein-Milman Theorem), leads to an abundance of pure states.

Theorem B.48. If $V$ is a normed vector space, any $d$-ball

\[ V^*_d = \{\varphi \in V^* \mid \|\varphi\| \le d\} \tag{B.133} \]

is compact in the weak* topology. More generally, if $U$ is any neighborhood of $0$ in
$V$, the set $V^*_U = \{\varphi \in V^* \mid |\varphi(v)| \le d\ \forall v \in U\}$ is w*-compact.

Clearly, $U = V_1$ yields (B.133). Omitting the proof, we just note that the first claim
is based on the fact that $V^*_d$ is a closed subset of the space

\[ \prod_{v \in V} \{z \in \mathbb{C} \mid |z| \le d\,\|v\|\}, \]

which is compact by Tychonoff's Theorem in topology (such reliance on awful
nonconstructive results is unfortunately typical of traditional functional analysis).

After this abstract theory, it is high time to turn to some examples; see Table B.1.


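No. | V                  | V*                                  | V*-V pairing                                                   | comment
----+--------------------+-------------------------------------+----------------------------------------------------------------+-----------------------------------------------------------
1.  | $C_0(X)$           | $M(X)$                              | $\langle\mu,f\rangle = \int_X d\mu\, f$                        | $X$ locally compact Hausdorff space
2.  | $C_b(X)$           | $M(\beta X)$                        | $\langle\mu,f\rangle = \int_{\beta X} d\mu\, \beta f$          | $\beta X$ Čech-Stone compactification of $X$
3.  | $\ell_0(X)$        | $\ell^1(X)$                         | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $X$ countable set, $\ell_0(\mathbb{N})$ often called $c_0$
4.  | $\ell^1(X)$        | $\ell^\infty(X)$                    | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $X$ countable set
5.  | $\ell^\infty(X)$   | $ba(X,\mathcal{P}(X))_{\mathbb{C}}$ | $\langle\mu,g\rangle = \int_X d\mu\, g$                        | bounded finitely additive signed measures on $X$
6.  | $\ell^p(X)$        | $\ell^q(X)$                         | $\langle f,g\rangle = \sum_{x\in X} f(x)g(x)$                  | $1/p + 1/q = 1$, $p,q \neq 1,\infty$, $X$ countable
7.  | $\ell^2(X)$        | $\ell^2(X)$                         | $\langle f,g\rangle = \sum_{x\in X} \overline{f(x)}g(x)$       | $\ell^2$ treated as a Hilbert space
8.  | $H$                | $H$                                 | $\langle f,g\rangle = \langle f,g\rangle_H$                    | $H$ general Hilbert space
9.  | $L^2(X)$           | $L^2(X)$                            | $\langle f,g\rangle = \int_X d\mu\, \overline{f}g$             | $L^2$ treated as a Hilbert space
10. | $L^1(X)$           | $L^\infty(X)$                       | $\langle f,g\rangle = \int_X d\mu\, fg$                        | $(X,\Sigma,\mu)$ $\sigma$-finite measure space
11. | $L^p(X)$           | $L^q(X)$                            | $\langle f,g\rangle = \int_X d\mu\, fg$                        | $1/p + 1/q = 1$, $p,q \neq 1,\infty$
12. | $B_0(H)$           | $B_1(H)$                            | $\langle\rho,a\rangle = \mathrm{Tr}(\rho a)$                   | $B_0(H)$ compact operators, $B_1(H)$ trace class
13. | $B_1(H)$           | $B(H)$                              | $\langle a,\rho\rangle = \mathrm{Tr}(\rho a)$                  | $B(H)$ bounded operators on $H$
14. | $M_*$              | $M$                                 | $\langle a,\varphi\rangle = \varphi(a)$                        | $M_*$ predual of von Neumann algebra $M$

Table B.1 Some Banach spaces and their duals, up to isometric isomorphism

1. The first entry is Theorem B.24.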

2. This one is true by definition if we define the Čech-Stone compactification $\beta X$
of a locally compact (Hausdorff) space $X$ as the Gelfand spectrum of $C_b(X)$ as a
commutative C*-algebra, or, equivalently, by

\[ C_b(X) \cong C(\beta X); \tag{B.134} \]

The compact Hausdorff space $\beta X$ then has the feature that each $f \in C_b(X)$ has
a unique continuous extension to $\beta X$. More generally, let $X$ be a topological
space. Provided it exists, "the" Čech-Stone compactification of $X$, denoted by
$\beta X$, is a compact Hausdorff space together with a continuous map $\beta_X : X \to \beta X$
such that for each compact Hausdorff space $K$ and each continuous function
$f : X \to K$ there is a unique continuous function $\beta f : \beta X \to K$ such that the
following diagram commutes:

\[
\begin{array}{ccc}
X & \xrightarrow{\ \beta_X\ } & \beta X \\
  & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \beta f} \\
  &                          & K
\end{array}
\]

This universal property makes $\beta X$ unique up to homeomorphism (if it exists).
If $X$ is locally compact Hausdorff, then $\beta X$ exists and $\beta_X$ is injective, making
$\beta_X(X) \cong X$ a dense subspace of $\beta X$. The above diagram then implies (B.134)
through $f \mapsto \beta f$; just take $K = \overline{\mathrm{Ran}(f)}$, which is compact since $f$ is bounded.
Specializing this case to arbitrary sets $X$ seen as discrete topological spaces, we
can give an explicit description of $\beta X$ as the set of all ultrafilters on $X$.

Definition B.49. Let $X$ be any set (seen as a discrete topological space).

• A filter on $X$ is a non-empty collection $F$ of subsets of $X$ such that $A \in F$ and
$B \in F$ implies $A \cap B \in F$, $A \in F$ and $A \subseteq B$ implies $B \in F$, and finally $\emptyset \notin F$.

• An ultrafilter is a filter that is maximal in the set of all proper filters $F$ (i.e.
$F \subseteq \mathcal{P}(X)\setminus\{\emptyset\}$), ordered by inclusion. It is straightforward to show that a filter
$F$ is maximal iff one and hence all of the following equivalent conditions hold:

a. for any $A \subseteq X$ we have either $A \in F$ or $A^c \in F$;

b. if $A \cup B \in F$, then $A \in F$ or $B \in F$ (i.e., $F$ is prime);

c. if $A \cap B \neq \emptyset$ for all $B \in F$, then $A \in F$.

• For any $x \in X$, the set $U_x$ of all subsets of $X$ that contain $x$ forms an ultrafilter,
called principal; any ultrafilter not of this kind is called free (if $|X| = \infty$, the
existence of free ultrafilters on $X$ follows from Zorn's Lemma).
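For a finite set the definition can be checked by brute force, which makes the role of condition (a) tangible: every ultrafilter on a finite set turns out to be principal, so free ultrafilters only appear when $X$ is infinite. The following Python sketch enumerates all families of subsets of a three-element set and keeps those that are filters satisfying condition (a); the helper names (`powerset`, `is_filter`, and so on) are hypothetical and chosen for this illustration only.

```python
from itertools import combinations

def powerset(xs):
    # All subsets of xs, as frozensets.
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_filter(F, X):
    # Non-empty, does not contain the empty set, closed under
    # finite intersections and under passing to supersets.
    if not F or frozenset() in F:
        return False
    closed_under_meets = all((A & B) in F for A in F for B in F)
    upward_closed = all(B in F for A in F for B in powerset(X) if A <= B)
    return closed_under_meets and upward_closed

def is_ultrafilter(F, X):
    # Condition (a): for each A, exactly one of A and its complement lies in F.
    X = frozenset(X)
    return is_filter(F, X) and all((A in F) != ((X - A) in F) for A in powerset(X))

def principal(x, X):
    # The principal ultrafilter U_x: all subsets of X containing x.
    return frozenset(A for A in powerset(X) if x in A)

def all_ultrafilters(X):
    # Brute force over all families of subsets; feasible only for tiny X.
    subsets = powerset(X)
    found = []
    for r in range(1, len(subsets) + 1):
        for family in combinations(subsets, r):
            F = frozenset(family)
            if is_ultrafilter(F, X):
                found.append(F)
    return found

X = {0, 1, 2}
ultrafilters = all_ultrafilters(X)
```

On this three-point set the search returns exactly the three principal ultrafilters $U_0$, $U_1$, $U_2$.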

For discrete $X$, the set of all ultrafilters on $X$, endowed with the topology generated
by all sets of the form $\mathcal{U}_A = \{U \in \beta X \mid A \in U\}$, where $A \subseteq X$, is a realization
of the Čech-Stone compactification of $X$, and may therefore be denoted by $\beta X$.
Note that each $\mathcal{U}_A$ is clopen in $\beta X$. The embedding $\beta_X$ maps $x \in X$ to the principal
ultrafilter $U_x$, and the continuous extension $\beta f$ of $f : X \to K$ is given by

\[ \beta f(U) = \lim_U f = \bigcap_{A \in U} \overline{f(A)}. \tag{B.136} \]

Theorem 4.24 then explains the pairing in no. 2 of Table B.1 (see also no. 5).

3. • This is a special case of no. 1, since $\ell_0(X) = C_0(X)$, given that $X$ is discrete (as
a topological space). We then use the (Lebesgue-) Radon-Nikodym Theorem
of measure theory: if $(X,\Sigma,\mu)$ is a $\sigma$-finite measure space and $\nu$ is a complex
measure on $X$ that is absolutely continuous with respect to $\mu$ (i.e., $\mu(A) = 0$
implies $\nu(A) = 0$, $A \in \Sigma$), then there is a function $d\nu/d\mu \in L^1(X)$ such that

\[ \int_X d\nu\, f = \int_X d\mu\, \frac{d\nu}{d\mu}\, f, \quad f \in L^\infty(X). \tag{B.137} \]

In the case at hand, $X$ is countable and $\mu$ is the counting measure, with respect
to which any measure is absolutely continuous. This yields $M(X) \cong \ell^1(X)$.

• Secondly, this duality is also a special case of Theorem B.27: as in (B.78),

\[ \ell_0(X) = \ell^\infty(X, \mathcal{P}_f(X)), \tag{B.138} \]

so that bounded hermitian functionals $\varphi : \ell_0(X) \to \mathbb{C}$ (which in this case
correspond to bounded real-linear functionals $\varphi : \ell_0(X,\mathbb{R}) \to \mathbb{R}$) are given by

\[ \varphi(g) = \lim_{n} \int d\mu\, s_n, \]

where $g \in \ell_0(X)$, $(s_n)$ is a sequence in $\mathrm{Step}(X, \mathcal{P}_f(X))$, which simply consists
of functions on $X$ with finite support, and $\mu$ is a finitely additive bounded
signed measure on $\mathcal{P}_f(X)$, which is given by its values on any singleton $x \in X$
and hence is just a real-valued function

\[ f(x) = \mu(\{x\}); \tag{B.139} \]

boundedness of $\mu$ gives $f \in \ell^1(X)$. Writing $X = \bigcup_n X_n$, where the $X_n$ are finite
and $X_n \subset X_{n+1}$ (e.g., for $X = \mathbb{N}$ one may take $X_n = \{1,\ldots,n\}$, so that
$\sum_{x \in X_n} = \sum_{x=1}^{n}$), we take $s_n = f|_{X_n}$ on $X_n$ and $s_n(x) = 0$ outside $X_n$, which gives

\[ \varphi(g) = \lim_{n} \sum_{x \in X_n} f(x)g(x) = \sum_{x \in X} f(x)g(x). \tag{B.140} \]

One easily verifies that indeed $\|f\|_1 = \|\mu\|$, since (B.56) - (B.57) yield

\[ \|\mu\| = \sup\{\mu_+(A) + \mu_-(A) \mid A \in \mathcal{P}_f(X)\} = \sup\Big\{\sum_{x \in A} |f(x)|,\ A \in \mathcal{P}_f(X)\Big\}, \]

whose right-hand side in turn is equal to $\|f\|_1$.

• As a third approach, we give a direct proof of the desired duality

\[ \ell_0(X)^* \cong \ell^1(X). \tag{B.141} \]

To start, for $f \in \ell^1(X)$ and $g \in \ell_0(X)$, we define an expression $\varphi_f(g)$ by

\[ \varphi_f(g) = \langle f,g\rangle = \sum_{x \in X} f(x)g(x). \tag{B.142} \]

By the obvious estimate

\[ |\varphi_f(g)| \le \|f\|_1 \|g\|_\infty, \tag{B.143} \]

which is Hölder's inequality for $p = 1$ and $q = \infty$, the sum (B.142) is absolutely
convergent, and hence defines a linear map $\varphi_f : \ell_0(X) \to \mathbb{C}$, which satisfies
$\|\varphi_f\| \le \|f\|_1$. Thus the map $f \mapsto \varphi_f$ is well defined from $\ell^1(X)$ to $\ell_0(X)^*$.
To prove surjectivity of this map, for given $\varphi \in \ell_0(X)^*$, define $f : X \to \mathbb{C}$ by

\[ f(x) = \varphi(\delta_x). \tag{B.144} \]

It follows from continuity of $\varphi$ that $\varphi = \varphi_f$, cf. (B.140), but it remains to be
shown that $f \in \ell^1(X)$. To do so, for each $n \in \mathbb{N}$ we define $\varphi_n : \ell_0(X) \to \mathbb{C}$ by

\[ \varphi_n(g) = \sum_{x \in X_n} f(x)g(x). \tag{B.145} \]

This operator is bounded, with

\[ \|\varphi_n\| = \|s_n\|_1, \tag{B.146} \]

where $s_n$ was defined prior to (B.140). To see this, we have

\[ \|\varphi_n\| \le \|s_n\|_1, \tag{B.147} \]

from (B.143), whereas the opposite inequality follows from a trick: define

\[ g_n(x) = \overline{f(x)}/|f(x)| \quad (x \in X_n,\ f(x) \neq 0); \tag{B.148} \]

\[ g_n(x) = 0 \quad \text{(otherwise)}, \tag{B.149} \]

so that, assuming $s_n \neq 0$, we have $\|g_n\|_\infty = 1$ and $\varphi_n(g_n) = \|s_n\|_1$, hence

\[ \|\varphi_n\| \ge \|s_n\|_1. \tag{B.150} \]

Since $\varphi(g) = \varphi_f(g)$ is finite by assumption, as in (B.140) $\lim_{n\to\infty} \varphi_n(g)$ exists
for each $g \in \ell_0(X)$. Hence $\lim_{n\to\infty} |\varphi_n(g)|$ exists, so $\sup_n\{|\varphi_n(g)|\} < \infty$.
The Principle of Uniform Boundedness (cf. Theorem B.78 below) then gives
$\sup_n\{\|\varphi_n\|\} < \infty$, and this supremum equals $\sup_n\{\|s_n\|_1\} = \|f\|_1$.

Comparing the first two approaches, we see that bounded additive measures
on $\mathcal{P}_f(X)$ bijectively correspond to bounded $\sigma$-additive measures on
$\mathcal{P}(X)$, both of which in turn are given by positive functions $f \in \ell^1(X)$.
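The norm-attaining trick used in the third approach can be checked numerically. The sketch below assumes $X = \mathbb{N}$ with $X_n = \{1,\ldots,n\}$ and a hypothetical $f \in \ell^1(\mathbb{N})$ with $|f(x)| = 2^{-x}$; it uses the complex conjugate in the test function so that $f(x)g_n(x) = |f(x)|$ term by term. All names are illustrative.

```python
def f(x):
    # Hypothetical element of l^1(N): modulus 2^{-x}, phase i^x.
    return (1j ** x) * 2.0 ** (-x)

def phi_n(f, g, n):
    # Truncated pairing: sum of f(x) g(x) over X_n = {1, ..., n}.
    return sum(f(x) * g(x) for x in range(1, n + 1))

def g_n(f, n):
    # Unimodular test function on X_n where f != 0, and 0 elsewhere.
    def g(x):
        if 1 <= x <= n and f(x) != 0:
            return f(x).conjugate() / abs(f(x))
        return 0j
    return g

def l1_norm_truncated(f, n):
    # l^1 norm of f restricted to X_n.
    return sum(abs(f(x)) for x in range(1, n + 1))

n = 20
attained = phi_n(f, g_n(f, n), n)        # value of phi_n on the test function
norm_of_truncation = l1_norm_truncated(f, n)
```

Since the test function has sup-norm one, the computed value shows that the bound $\|\varphi_n\| \le \|f|_{X_n}\|_1$ is attained.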




4. This is similar to the third proof of the previous case. For $f \in \ell^\infty(X)$ and
$g \in \ell^1(X)$, we define $\varphi_f(g)$ by (B.142), and instead of (B.143) we now obtain

\[ |\varphi_f(g)| \le \|f\|_\infty \|g\|_1. \tag{B.151} \]

Thus we have a map $f \mapsto \varphi_f$ from $\ell^\infty(X)$ to $\ell^1(X)^*$, satisfying $\|\varphi_f\| \le \|f\|_\infty$.
To prove surjectivity, for some $\varphi \in \ell^1(X)^*$ we once again define $f : X \to \mathbb{C}$ by
(B.144), so that $\varphi = \varphi_f$ by continuity. Then for any $x \in X$, we obtain
$|f(x)| \le \|\varphi\|\,\|\delta_x\|_1 = \|\varphi\|$, so $\|f\|_\infty \le \|\varphi\|$ and hence $\|\varphi\| = \|f\|_\infty$. In particular,
$f \in \ell^\infty(X)$ and the bijection $\varphi_f \leftrightarrow f$ gives an isometric isomorphism à la (B.141):

\[ \ell^1(X)^* \cong \ell^\infty(X). \tag{B.152} \]

5. Similar to no. 3, this is a special case of two more general dualities, namely

\[ \ell^\infty(X)^* \cong M(\beta X); \tag{B.153} \]

\[ \ell^\infty(X,\mathbb{R})^* \cong ba(X,\mathcal{P}(X)), \tag{B.154} \]

cf. no. 2, and Theorem B.27, respectively. Thus bounded additive measures
$\mu$ on $X$ (with underlying semiring $\mathcal{S} = \mathcal{P}(X)$) bijectively correspond to
bounded $\sigma$-additive measures $\mu_\beta$ on $\beta X$ (equipped with the Borel $\sigma$-algebra) by

\[ \int_X d\mu\, f = \int_{\beta X} d\mu_\beta\, \beta f, \tag{B.155} \]

for any $f \in \ell^\infty(X)$. This is not as surprising as it seems, because there is a bijective
correspondence between ultrafilters $U$ on $X$ and finitely additive probability
measures $\mu$ on $X$ that take values in $\{0,1\}$. This correspondence is given by:

\[ U = \{A \subseteq X \mid \mu(A) = 1\}; \tag{B.156} \]

\[ \mu(A) = 1 \ \text{iff}\ A \in U. \tag{B.157} \]

Principal ultrafilters $U_x$ thereby correspond to Dirac measures $\delta_x$ on $X$, whereas
free ultrafilters $U$ correspond to (finitely additive) measures $\mu_U$ on $X$ that vanish
on any finite subset of $X$. For general ultrafilters $U \in \beta X$ we have, for $f \in \ell^\infty(X)$,

\[ \int_X d\mu_U\, f = \bigcap_{A \in U} \overline{f(A)}, \tag{B.158} \]

where $f(A) = \{f(x) \mid x \in A\}$ as usual, and $\overline{f(A)}$ is the closure of this set in $\mathbb{C}$.
Thus (B.158) is equal to the unique $z \in \mathbb{C}$ with the property that for each $\varepsilon > 0$
the set $\{x \in X : |f(x) - z| < \varepsilon\}$ lies in $U$; for $U = U_x$, this recovers $z = f(x)$.
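On a finite set only principal ultrafilters exist, but the correspondence between ultrafilters and two-valued measures, as well as the intersection formula for the integral, can still be illustrated there. The Python sketch below is a finite toy model under that assumption (closures are unnecessary because all image sets are finite); the function names are hypothetical.

```python
from itertools import combinations

def powerset(xs):
    # All subsets of xs, as frozensets.
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def principal_ultrafilter(x, X):
    # U_x: all subsets of X that contain x.
    return frozenset(A for A in powerset(X) if x in A)

def measure_from_ultrafilter(U):
    # The {0,1}-valued set function mu(A) = 1 iff A is in U.
    return lambda A: 1 if frozenset(A) in U else 0

def integral_against_ultrafilter(f, U):
    # Intersection over A in U of the image sets f(A); on a finite set this
    # intersection is a singleton {z}, and z is returned.
    images = [frozenset(f(a) for a in A) for A in U]
    (z,) = frozenset.intersection(*images)
    return z

X = {0, 1, 2, 3}
U = principal_ultrafilter(2, X)
mu = measure_from_ultrafilter(U)
value = integral_against_ultrafilter(lambda a: a * a + 1, U)
```

Here `mu` is the Dirac measure at the point 2, and the intersection formula returns the value of the function at that point.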

6. This is similar to nos. 3 and 4, but is slightly more involved. For $f \in \ell^q(X)$ and
$g \in \ell^p(X)$, with (B.16) and $p,q \neq 1,\infty$, we again define $\varphi_f(g)$ by (B.142), upon
which Hölder's inequality yields $\|\varphi_f\| \le \|f\|_q$. Conversely, for $\varphi \in \ell^p(X)^*$, once
again define $f$ by (B.144), so that $\varphi = \varphi_f$. We now show that $\|f\|_q \le \|\varphi\|$.
Pick $X_n \subset X$ as defined below (B.139), and define $f_n : X \to \mathbb{C}$ by $f_n(x) = f(x)$ if
$x \in X_n$ and $f_n(x) = 0$ if $x \notin X_n$. If $\|f\|_q < \infty$, then $\|f\|_q = \sup_n \|f_n\|_q$. Now define

\[ g_n(x) = |f_n(x)|^q / f_n(x) \quad (f_n(x) \neq 0); \tag{B.159} \]

\[ g_n(x) = 0 \quad (f_n(x) = 0). \tag{B.160} \]

Using (B.142), we obtain

\[ \|f_n\|_q \|f_n\|_q^{q-1} = \|f_n\|_q^q = \langle f_n, g_n\rangle = \varphi(g_n) \le \|\varphi\|\,\|g_n\|_p = \|\varphi\|\,\|f_n\|_q^{q-1}, \tag{B.161} \]

whence $\|f_n\|_q \le \|\varphi\|$. Taking $\sup_n$ gives $\|f\|_q \le \|\varphi\|$, and hence

\[ \ell^p(X)^* \cong \ell^q(X). \tag{B.162} \]
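The two identities behind the chain of (in)equalities above, namely $\langle f_n, g_n\rangle = \|f_n\|_q^q$ and $\|g_n\|_p = \|f_n\|_q^{q-1}$ (the latter because $(q-1)p = q$), are easy to confirm numerically. The following Python sketch does so for a hypothetical finitely supported $f$ with $q = 3$, $p = 3/2$; names and sample values are illustrative.

```python
def lr_norm(values, r):
    # l^r norm of a finite sequence.
    return sum(abs(v) ** r for v in values) ** (1.0 / r)

def dual_test_vector(f, q):
    # g(x) = |f(x)|^q / f(x) where f(x) != 0, and 0 otherwise.
    return [abs(v) ** q / v if v != 0 else 0j for v in f]

def pairing(f, g):
    # Bilinear pairing: sum of f(x) g(x), without complex conjugation.
    return sum(a * b for a, b in zip(f, g))

q = 3.0
p = q / (q - 1.0)                         # conjugate exponent, 1/p + 1/q = 1
f = [1 + 2j, 0j, -0.5j, 3.0 + 0j]         # hypothetical finitely supported f
g = dual_test_vector(f, q)

lhs = pairing(f, g)                       # should equal ||f||_q^q
norm_f = lr_norm(f, q)
norm_g = lr_norm(g, p)                    # should equal ||f||_q^(q-1)
```

With these two facts, the bound $\varphi(g_n) \le \|\varphi\|\,\|g_n\|_p$ turns directly into $\|f_n\|_q \le \|\varphi\|$.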

7. $p = q = 2$ stands out as a special, self-dual case. As the next item explains,
this is because $\ell^2(X)$ is a Hilbert space with inner product (B.10). This differs
from the pairing (B.142) by the complex conjugation of the first term, making
it appropriate to redefine the pairing between $\ell^2(X)^*$ and $\ell^2(X)$ in terms of the
inner product. This leads to an antilinear isometric isomorphism $\ell^2(X)^* \cong \ell^2(X)$,
as opposed to the linear isometric isomorphisms for all other values of $p,q$.

8. Proposition A.5 generalizes to infinite-dimensional Hilbert spaces (in which case
it is often named after Riesz and Fréchet), with the following additions to the
proof. First, boundedness of $f$ guarantees that $\ker(f)$ is a closed subspace of $H$,
so that (if $f \neq 0$) the orthogonal complement $\ker(f)^\perp$ is not empty by Proposition
B.57 below. Second, uniqueness of the representing vector $\psi$ in (A.13) now needs
to be shown. This is easy: if $\langle\psi,\varphi\rangle = \langle\psi',\varphi\rangle$ for all $\varphi \in H$, then, taking
$\varphi = \psi - \psi'$, it follows that $\langle\psi - \psi', \psi - \psi'\rangle = \|\psi - \psi'\|^2 = 0$, hence $\psi = \psi'$.

No. 9 follows from no. 8, whilst 10 and 11 are similar to 4 and 6, except for some
tricky measure-theoretic details. We only sketch the main idea (where for simplicity
we assume $\mu$ is finite; using an approximation procedure the result is valid also
for the $\sigma$-finite case, but not beyond!). Namely, the function $f$ representing the
functional $\varphi \in L^p(X)^*$ is constructed by first defining a complex measure $\nu$ on $\Sigma$
by $\nu(A) = \varphi(1_A)$, $A \in \Sigma$. Using (B.85), we see that $\nu$ is absolutely continuous with
respect to $\mu$, and we put

\[ f = d\nu/d\mu. \tag{B.163} \]

Using definition (B.29) of integration, this yields

\[ \varphi(g) = \langle f,g\rangle = \int_X d\mu\, fg, \tag{B.164} \]

and similar arguments as in the discrete case show that $f \in L^q(X)$.

Nos. 12-13 follow from Theorem B.146 below, and no. 14, which is forward-looking,
too, is true by definition of the predual of a von Neumann algebra (whose
existence is highly nontrivial); see Theorem C.132 in Appendix §C.

Returning to the abstract theory, we now apply the Hahn-Banach Theorem and duality
theory to prove one of the most beautiful results in functional analysis.

The boundary $\partial_e K$ of a convex set $K$ consists of all $v \in K$ satisfying:

if $v = tw + (1-t)x$ for certain $w,x \in K$ and $t \in (0,1)$, then $v = w = x$.

Hence Carathéodory's Theorem 1.12, which, we recall, states that if $K$ is a nonempty
compact convex subset of $\mathbb{R}^n$, then $\partial_e K \neq \emptyset$, and each point of $K$ is a convex sum
of at most $n+1$ points in $\partial_e K$, implies, in particular, that $\partial_e K$ is not empty. This is
readily visualized: the simplest example is $K = [0,1]$, where $\partial_e K = \{0,1\}$. One also
has triangles in the plane, whose boundaries consist of their vertices (rather than
their sides, which are among their faces, see below). Furthermore, the closed (unit)
three-ball $B^3$ in $\mathbb{R}^3$ is convex, with boundary $\partial_e B^3 = S^2$, cf. Proposition 2.9. In these
examples the interior of $K$, which is still convex, would have an empty boundary, so
that the assumption of compactness in Theorem 1.12 is absolutely essential.

Carathéodory's Theorem follows from a straightforward induction argument in
the dimension of $K$, and the following Krein-Milman Theorem. The convex hull
$\mathrm{co}(X)$ of a subset $X$ of a vector space is defined as the set of all finite convex sums
$\sum_i t_i x_i$, where $t_i \ge 0$, $\sum_i t_i = 1$, and $x_i \in X$ (for two points these are the sums
$tx + (1-t)y$ with $t \in (0,1)$ and $x,y \in X$); this is the smallest convex set containing $X$.
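In the plane, the extreme boundary of the convex hull of finitely many points can be computed directly, which makes the statement about triangles and their vertices concrete. The Python sketch below uses Andrew's monotone chain algorithm (a standard convex hull method, not taken from the text) and discards collinear points, so that only genuine extreme points survive; the function names are illustrative.

```python
def cross(o, a, b):
    # Signed area of the parallelogram spanned by (a - o) and (b - o).
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    # Extreme points of co(points) in the plane via Andrew's monotone chain.
    # Points on an edge (cross == 0) are popped, so only vertices remain.
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return sorted(set(lower[:-1] + upper[:-1]))

# A 3x3 grid: four corners, four edge midpoints, and the centre of a square.
grid = [(x, y) for x in range(3) for y in range(3)]
vertices = extreme_points(grid)
```

Edge midpoints and the centre are convex sums of other points of the set and are therefore discarded, in line with the definition of $\partial_e K$.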

Theorem B.50. Let $V$ be a real normed vector space with dual $V^*$, and let $K$ be a
convex subset of $V^*$ that is compact in the w*-topology. Then $\partial_e K \neq \emptyset$, and each
point of $K$ lies in the w*-closure of the convex hull of $\partial_e K$. In other words,

\[ K = (\mathrm{co}(\partial_e K))^-. \tag{B.165} \]

Zorn's Lemma will be used twice in the proof: both directly and through Theorem
B.43, which relies on the Hahn-Banach Theorem B.40, whose proof uses Zorn.
Furthermore, a face of a convex set $K$ is a nonempty convex subset $F \subseteq K$ such that:

if $z = tx + (1-t)y$ for $z \in F$ with $t \in (0,1)$ and $x,y \in K$, then $x,y \in F$.

In particular, each extreme point $x \in \partial_e K$ is a face in its own right; conversely, a face
consisting of a single point lies in $\partial_e K$ (as should be clear from the definitions).

Proof. 1. Let $\mathcal{F}(K)$ be the set of all closed faces in $K$, partially ordered by inverse
inclusion, i.e., $F_1 \le F_2$ iff $F_2 \subseteq F_1$. The intersection of any finite subset of
a totally ordered subset $\{F_\lambda\}$ of $\mathcal{F}(K)$ is obviously nonempty, so that, by
compactness of $K$, we also have $\bigcap_\lambda F_\lambda \neq \emptyset$. (Proof by contradiction: if
$\bigcap_\lambda F_\lambda = \emptyset$, then $\bigcup_\lambda F_\lambda^c = (\bigcap_\lambda F_\lambda)^c = K$, so that $\{F_\lambda^c\}$ is an open cover of $K$,
which by definition of compactness has a finite subcover $\{F_{\lambda'}^c\}$. By the same
argument, $\bigcap_{\lambda'} F_{\lambda'} = \emptyset$.) Hence $\bigcap_\lambda F_\lambda$ is an upper bound of $\{F_\lambda\}$, so that Zorn
gives us a (not necessarily unique) maximal element $F_0$ in $\mathcal{F}(K)$ (which
set-theoretically is minimal because of the reverse ordering, i.e., $F_0$ contains no
strictly smaller closed face).

2. We now show that $F_0$ must be a singleton (and hence an extreme point of $K$).

For any $v \in V$, the function $\hat{v} : V^* \to \mathbb{R}$ defined by $\hat{v}(\varphi) = \varphi(v)$ is w*-continuous,
see Propositions B.44 and B.46. Since $F_0 \subseteq K$ is compact, $\hat{v}$ assumes a minimum
on $F_0$, say $m$. The set

\[ F_m = \{\varphi \in F_0 \mid \hat{v}(\varphi) = m\} \tag{B.166} \]

is not only closed (by continuity of $\hat{v}$), and hence compact (since $F_0$ is), but it is
again a face in $K$: first, if $\varphi \in F_m$ takes the form

\[ \varphi = t\varphi_1 + (1-t)\varphi_2, \tag{B.167} \]

with $\varphi_1,\varphi_2 \in F_0$, then

\[ \hat{v}(\varphi) = m = t\hat{v}(\varphi_1) + (1-t)\hat{v}(\varphi_2), \tag{B.168} \]

which, given that $\hat{v}(\varphi_i) \ge m$, is only possible if $\hat{v}(\varphi_1) = \hat{v}(\varphi_2) = m$, so that
$\varphi_i \in F_m$. Hence $F_m$ is a face in $F_0$, but this implies that it is equally well a face in $K$.
Namely, if (B.167) holds for $\varphi \in F_m$ and $\varphi_i \in K$, then regarding $\varphi$ as an element
of $F_0$ gives $\varphi_i \in F_0$, because $F_0$ is a face in $K$, upon which the previous step,
where we regard $\varphi$ as an element of $F_m$, gives $\varphi_i \in F_m$.

Since $F_0$ is maximal, we must have $F_m = F_0$, so that each functional $\hat{v}$ is constant
on $F_0$. Now we know (even without the Hahn-Banach Theorem) that the functionals
$\hat{v}$ separate points in $V^*$, since the very statement that $\varphi_1 \neq \varphi_2$ means that
there is some $v \in V$ such that $\varphi_1(v) \neq \varphi_2(v)$ and hence $\hat{v}(\varphi_1) \neq \hat{v}(\varphi_2)$. So if $F_0$
contains more than one point, there must be a functional $\hat{v}$ that is not constant on
$F_0$. Hence $F_0$ is a singleton, and therefore an element of $\partial_e K$. That is, $\partial_e K \neq \emptyset$.

3. The same argument applies to any closed face $F$ in $K$, showing that each
$F \in \mathcal{F}(K)$ contains at least one point in $\partial_e F$. But such a point is a face in $F$ and
hence in $K$, and being a one-point face in $K$, it must lie in $\partial_e K$. So we may
strengthen the previous point by concluding that $F \cap \partial_e K \neq \emptyset$ for any closed face
$F \subseteq K$.

4. To prove (B.165) by reductio ad absurdum, define

\[ B = (\mathrm{co}(\partial_e K))^-, \tag{B.169} \]

and assume $B \neq K$. First note that $\mathrm{co}(\partial_e K)$ is convex by construction, and that
its closure $B$ remains convex (because the vector space operations, and a fortiori
the convex sums, are continuous). Its complement in $V^*$ is open, and hence any
point $\alpha \in K \setminus B$ has an open convex neighbourhood $A \subset V^* \setminus B$ (see below), which
is therefore disjoint from $B$. Hence Theorem B.43 applies (with $V \rightsquigarrow V^*$ and
$\varphi \rightsquigarrow \hat{v}$), giving us $v \in V$ and $t \in \mathbb{R}$ for which $\hat{v}(\alpha) < t \le \hat{v}(\beta)$ for any $\beta \in B$.
Now define $s = \min\{\hat{v}(\varphi) \mid \varphi \in K\}$, which exists since $K$ is w*-compact and $\hat{v}$ is
w*-continuous. Since $\alpha \in K \setminus B \subset K$ and $\hat{v}(\alpha) < t$, we have $s < t$. Subsequently,
define $F_s = \{\varphi \in K \mid \hat{v}(\varphi) = s\}$. As in step 2 above, it follows that $F_s$ is a closed
face in $K$. According to step 3, there is a point $\omega \in F_s \cap \partial_e K$, so that $\hat{v}(\omega) = s$.
This contradicts $s < t \le \hat{v}(\beta)$ for any $\beta \in B$, as $\omega \in \partial_e K \subseteq B$. □

The existence of $A$ in step 4 above arises from the fact that open sets of the form

\[ \{\varphi \in V^* \mid |\varphi(v_i)| < \varepsilon\ (i = 1,\ldots,n)\}, \tag{B.170} \]

where $\varepsilon > 0$ and all $v_i \in V$, form a basis of w*-neighbourhoods of $0 \in V^*$, and hence
the translates of these sets by $\omega$ form such a basis for any $\omega \in V^*$; the point is that such
sets are convex, because if $|\varphi_i(v)| < \varepsilon$ for $i = 1,2$ and $t \in (0,1)$, then

\[ |(t\varphi_1 + (1-t)\varphi_2)(v)| \le t|\varphi_1(v)| + (1-t)|\varphi_2(v)| < (t + 1 - t)\varepsilon = \varepsilon. \tag{B.171} \]

Although the Krein-Milman Theorem is of considerable interest and beauty in
itself, our main use of it lies in a few corollaries. Among those is Choquet's Theorem
in the next section, but we first turn to the Stone-Weierstrass Theorem:

Theorem B.51. Let $X$ be a compact Hausdorff space. Let $B$ be an involutive subalgebra
of $C(X)$ (regarded as a commutative C*-algebra) that separates points on $X$
(i.e., if $x \neq y$ there is $f \in B$ such that $f(x) \neq f(y)$) and contains the unit function $1_X$.
Then $B$ is dense in $C(X)$ in the sup-norm. In particular, if $B$ is closed, then $B = C(X)$.

In other words, $B$ is a linear subspace of $C(X)$ such that if $f,g \in B$, then $fg \in B$,
and if $f \in B$, then $f^* \in B$, where $f^*(x) = \overline{f(x)}$. Furthermore, $C(X)$ and hence $B$ are
equipped with the sup-norm. The assumptions could even be weakened; instead of
asking that $1_X \in B$ and that $B$ separate points, for the proof we just need that for
each $x,y \in X$ and $s,t \in \mathbb{R}$ there is $f \in B$ such that $f(x) = s$ and $f(y) = t$.
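The classical special case $X = [0,1]$ with $B$ the polynomials (a subalgebra that contains $1_X$, is closed under complex conjugation, and separates points) can be illustrated constructively. The Python sketch below uses Bernstein polynomials, which is a different and more elementary route than the Krein-Milman argument given in the text, to approximate the hypothetical test function $f(x) = |x - 1/2|$ uniformly; names and grid size are illustrative choices.

```python
from math import comb

def bernstein(f, n):
    # n-th Bernstein polynomial of f on [0, 1]:
    # B_n f(x) = sum_k f(k/n) * C(n, k) * x^k * (1 - x)^(n - k).
    def p(x):
        return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k in range(n + 1))
    return p

def sup_error(f, p, grid=201):
    # Sup-norm distance between f and p, estimated on a uniform grid.
    return max(abs(f(i / (grid - 1)) - p(i / (grid - 1))) for i in range(grid))

def f(x):
    # Continuous but not differentiable at x = 1/2.
    return abs(x - 0.5)

errors = [sup_error(f, bernstein(f, n)) for n in (10, 50, 200)]
```

The estimated sup-norm error decreases with the degree, in line with density of the polynomials in $C([0,1])$; the slow rate reflects the kink of $f$ at $1/2$.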

We are going to derive Theorem B.51 from Theorem B.50 and the following:

Lemma B.52. Let $B$ be a linear subspace of some Banach space $V$. Then $B$ is dense
in $V$ iff the only element $\varphi \in V^*$ that satisfies $\varphi(v) = 0$ for all $v \in B$ is $\varphi = 0$.

Proof. The "$\Rightarrow$" direction (which will not be needed) is immediate from the fact that
$\varphi \in V^*$ is bounded and therefore, if $v = \lim v_\lambda$, for $(v_\lambda)$ in $B$, then $\varphi(v) = \lim \varphi(v_\lambda)$,
so that $\varphi(v) = 0$ for all $v \in B$ implies $\varphi(v) = 0$ for all $v \in V$ and hence $\varphi = 0$.

Conversely, if $B^- \neq V$, we will exhibit some nonzero $\varphi \in V^*$ with $\varphi|_B = 0$. Take
some $w \notin B^-$ and define $W \subseteq V$ by $W = \mathbb{C} \cdot w + B^-$, along with a map $\varphi_W : W \to \mathbb{C}$
given by $\varphi_W(\lambda w + v) = \lambda$ for any $\lambda \in \mathbb{C}$ and $v \in B^-$. This map is trivially linear,
as well as bounded: since $w \notin B^-$ we have $\|w - v\| \ge d$ for some $d > 0$, for each
$v \in B^-$; since then also $-v/\lambda \in B^-$ (for $\lambda \neq 0$), we have $\|\lambda w + v\| \ge |\lambda| d$, and therefore

\[ |\varphi_W(\lambda w + v)| = |\lambda| \le d^{-1}\|\lambda w + v\|. \]

By Corollary B.41, our $\varphi_W$ extends to some $\varphi \in V^*$, with $\varphi|_B = \varphi_W|_B = 0$. □

Proof. We now prove Theorem B.51. We define a subset $B^\perp \subset M(X)$ by

\[ B^\perp = \{\mu \in M(X) \mid \mu(f^*) = \overline{\mu(f)},\ \|\mu\| \le 1,\ \mu(f) = 0\ \forall f \in B\}, \tag{B.172} \]

where $f^*(x) = \overline{f(x)}$ as usual. Our aim is to show that

\[ B^\perp = \{0\}. \tag{B.173} \]

Since any $\varphi \in M(X)$ is a multiple of some $\mu$ in the unit ball $\|\mu\| \le 1$, eq. (B.173)
gives the antecedent of the "$\Leftarrow$" part of Lemma B.52, which gives Theorem B.51.

Noting that the w*-topology in $M(X)$ is just the topology in which $\mu_\lambda \to \mu$
iff $\mu_\lambda(f) \to \mu(f)$ for each $f \in C(X)$, we see that $B^\perp$ is closed in the unit ball of
$M(X)$, so that it is w*-compact by the Banach-Alaoglu Theorem. Furthermore, $B^\perp$
is convex, so the Krein-Milman Theorem gives $\partial_e B^\perp \neq \emptyset$. Any $\mu \in \partial_e B^\perp$ has either
$\|\mu\| = 0$, in which case (B.173) holds and we are ready, or, as we assume in what
follows,

\[ \|\mu\| = 1. \tag{B.174} \]

Indeed, if $0 < \|\mu\| < 1$, then

\[ \mu = t\mu_1 + (1-t)\mu_2, \tag{B.175} \]

with $t = \|\mu\|$, $\mu_1 = \mu/\|\mu\|$, and $\mu_2 = 0$ would give a nontrivial decomposition of $\mu$.
For $g \in C(X)$, define

\[ L_g : M(X) \to M(X); \tag{B.176} \]

\[ L_g\mu(f) = \mu(gf), \tag{B.177} \]

or "$L_g d\mu = g \cdot d\mu$". It follows from the assumptions on $B$ in Theorem B.51 that if
$0 \le g \le 1_X$ and $g \in B$ (as we will now assume), then $L_g$ maps $B^\perp$ into itself, and also
$0 \le 1_X - g \le 1_X$. Hence $L_{1_X - g}$ maps $B^\perp$ into itself. Given (B.174), we then have

\[ \|L_{1_X - g}\mu\| = 1 - \|L_g\mu\|. \tag{B.178} \]

This follows from (B.76): the Hahn-Jordan decomposition (B.55) of $\mu$ also gives
$(L_g\mu)_\pm = L_g\mu_\pm$ and $(L_{1_X - g}\mu)_\pm = L_{1_X - g}\mu_\pm$ (since $g \ge 0$ and $1_X - g \ge 0$), so that

\[ \|L_{1_X - g}\mu\| = L_{1_X - g}\mu_+(X) + L_{1_X - g}\mu_-(X) \tag{B.179} \]

\[ = \mu_+(X) + \mu_-(X) - L_g\mu_+(X) - L_g\mu_-(X) = \|\mu\| - \|L_g\mu\|. \tag{B.180} \]

Because of (B.178), we obtain a convex decomposition (B.175) with $t = \|L_g\mu\|$,
$\mu_1 = L_g\mu/\|L_g\mu\|$, and $\mu_2 = L_{1_X - g}\mu/\|L_{1_X - g}\mu\|$, which are well defined because
of (B.174), which guarantees that the two denominators are nonzero. Since $\mu$ is
extreme by assumption (i.e., it lies in $\partial_e B^\perp$), it must be that

\[ \mu = \frac{L_g\mu}{\|L_g\mu\|} = \frac{L_{1_X - g}\mu}{\|L_{1_X - g}\mu\|}. \tag{B.181} \]

Hence $g(x) = \|L_g\mu\|$ almost everywhere with respect to $\mu$; in particular, this must
hold for each $x \in \mathrm{supp}(\mu)$. Suppose there are at least two different points
$x,y \in \mathrm{supp}(\mu)$. Since $B$ separates points and contains $1_X$, we can easily find $0 \le g \le 1_X$
such that $g(x) \neq g(y)$, contradicting constancy of $g$ on $\mathrm{supp}(\mu)$. So $\mathrm{supp}(\mu) = \{x\}$,
which, given (B.174), implies that $\mu = \pm\delta_x$, so that $\mu(1_X) = \pm 1$. Since $1_X \in B$, this
contradicts (B.172). Hence (B.174) leads to a contradiction, and we are left with the
other possibility $\|\mu\| = 0$. This gives $\mu = 0$, that is, (B.173). □

B.11 Choquet's Theorem

Choquet's Theorem B.53 beautifully follows up on the Krein-Milman Theorem.
To state it, we need the support $\mathrm{supp}(\mu)$ of a measure $\mu$ on a space $X$, defined
as the smallest closed set $F$ such that $\mu(X \setminus F) = 0$, or, equivalently, as the largest
closed set $F$ such that each open neighbourhood $U$ of each $x \in F$ has strictly positive
measure $\mu(U) > 0$, provided such a set exists. This is the case, for example,
if $X$ is locally compact Hausdorff and $\mu$ is (inner) regular. To see this, let $\{U_\lambda\}$
be the set of all open $U_\lambda \in \mathcal{O}(X)$ such that $\mu(U_\lambda) = 0$, and let $U = \bigcup_\lambda U_\lambda$. By inner
regularity, $\mu(U) = \sup\{\mu(K) \mid K \subseteq U, K \in \mathcal{K}(X)\}$. Since each such $K$ is compact,
$K \subseteq \bigcup_{i=1}^{n} U_{\lambda_i}$, whence $\mu(K) \le \sum_{i=1}^{n} \mu(U_{\lambda_i}) = 0$. Hence $\mu(U) = 0$, and $\mathrm{supp}(\mu) = X \setminus U$.

Theorem B.53. In the notation of Theorem B.50, for each $\varphi \in K$ there is a probability
measure $\mu$ on $K$ whose support is contained in $\partial_e K^-$, such that for each $v \in V$,

\[ \varphi(v) = \int_{\partial_e K^-} d\mu(\omega)\, \omega(v). \tag{B.182} \]

Moreover, if $K$ is metrizable, then the support of $\mu$ may be restricted to $\partial_e K$.

Here $\partial_e K^- = (\partial_e K)^-$ is the closure of $\partial_e K$; in many examples (e.g., state spaces of
C*-algebras of infinite quantum systems), $\partial_e K$ is not closed or even Borel.

Reading (B.182) from right to left, the point $\varphi \in K$ is called the barycenter of $\mu$.
Preparing for the proof, we note that if $X$ is a compact Hausdorff space, the dual
$C(X)^*$ of $C(X)$ as a Banach space (in the sup-norm) is the space $M(X)$ of all complete
regular complex measures $\mu$ on $X$; cf. Theorem B.24. The set $M_1^+(X)$ of all
complete regular probability measures on $X$ is a closed subset of the unit ball of
$M(X)$, since $\|\mu\| = \mu(X) = 1$ if $\mu \in M_1^+(X)$, cf. (B.54), and hence $M_1^+(X)$ is
w*-compact by the Banach-Alaoglu Theorem. We will use these facts with $X = \partial_e K^-$.

We also recall that a (not necessarily continuous) function $f : K \to \mathbb{R}$ is affine if

\[ f(t\varphi_1 + (1-t)\varphi_2) = tf(\varphi_1) + (1-t)f(\varphi_2), \tag{B.183} \]

for $t \in (0,1)$ and $\varphi_1 \neq \varphi_2 \in K$, concave if one has $\ge$ instead of $=$ in (B.183), convex
with $\le$ instead of $=$, and strictly convex if (B.183) holds with $= \rightsquigarrow <$.

For example, $f(x) = x^2$ is strictly convex on $[-1,1]$. The assumption of metrizability
will only be used to prove the existence of a strictly convex continuous function
on $K$, so this existence could have been assumed instead of metrizability. Finally,
we denote the space of real-valued continuous affine functions on $K$ by $A(K)$.

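Proof. By Theorem B.50, $\varphi = \lim_\lambda \varphi_\lambda$, where $(\varphi_\lambda)$ is some net in $\mathrm{co}(\partial_e K)$, so
that $\varphi_\lambda = \sum_i t_i\omega_i$, where the sum is finite, $t_i \ge 0$, $\omega_i \in \partial_e K$, and $\sum_i t_i = 1$. Then
$\mu_\lambda = \sum_i t_i\delta_{\omega_i}$ is a probability measure on $\partial_e K$ and hence also on its (compact)
closure $\partial_e K^-$. Since $M_1^+(\partial_e K^-)$ is w*-compact, the previous net has a subnet that
w*-converges to some $\mu \in M_1^+(\partial_e K^-)$. Noting that $\varphi(v) = \hat{v}(\varphi)$, where $\hat{v} \in V^{**}$ is
w*-continuous by Proposition B.46, this $\mu$ by construction satisfies (B.182).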
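In finite dimensions the measures $\sum_i t_i \delta_{\omega_i}$ appearing in this argument can be written down explicitly. The Python sketch below treats a triangle in $\mathbb{R}^2$ (identified with its own dual, an assumption made only for this illustration): the barycentric coordinates of a point give a probability measure on the three vertices, and integrating a linear functional against that measure reproduces the value at the point, which is the content of the barycenter formula. The names `barycentric_weights` and `integrate` are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    # Weights (t1, t2, t3) with t1*a + t2*b + t3*c = p and t1 + t2 + t3 = 1,
    # for a non-degenerate triangle with vertices a, b, c in the plane.
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    t1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    t2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (t1, t2, 1.0 - t1 - t2)

def integrate(weights, vertices, v):
    # Integral of omega -> v(omega) against the measure sum_i t_i * delta_{omega_i}.
    return sum(t * v(w) for t, w in zip(weights, vertices))

def v(w):
    # A hypothetical linear functional on R^2.
    return 3.0 * w[0] - 2.0 * w[1]

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
point = (0.25, 0.5)
weights = barycentric_weights(point, *triangle)
integral = integrate(weights, triangle, v)
```

For this point the weights are $(1/4, 1/4, 1/2)$, and the integral agrees with the functional evaluated at the point itself.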
We now prove the last claim. If $K$ is metrizable, then $C(K)$ is separable, so that
its subspace $A(K)$ is separable, too. Thus we can find some countable dense subset
$(f_n)_{n>0}$ of $A(K)$, in terms of which we define a function $f_0 : K \to \mathbb{R}$ by

\[ f_0(\varphi) = \sum_{n=1}^{\infty} 2^{-n}(\|f_n\|_\infty + 1)^{-2}|f_n(\varphi)|^2. \tag{B.184} \]

First, continuity of $f_0$ follows from uniform convergence of this series and continuity
of each $f_n$; recall that $A(K) \subset C(K,\mathbb{R})$. Second, the example just given implies
that if $f \in A(K)$, then $f^2$ is convex, and $f_0$ is even strictly convex provided that for
$\varphi_1 \neq \varphi_2$ there is at least one $n > 0$ for which $f_n(\varphi_1) \neq f_n(\varphi_2)$. To show that this is
the case, we note that since $\hat{V} \subset V^{**}$ separates points in $V^*$ and each $\hat{v} \in V^{**}$ defines
an element of $A(K)$ by restriction, $A(K)$ separates points in $K$. Therefore, by density
of the family $(f_n)$, the claim follows, and $f_0$ is strictly convex. This will be crucial.

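For each real-valued $f \in C(K,\mathbb{R})$, define the concave envelope $\hat{f}$ by

\[ \hat{f}(\varphi) = \inf\{g(\varphi) \mid g \in A(K),\ g \ge f\}. \tag{B.185} \]

The terminology comes from the fact that $f \le \hat{f}$ for any $f \in C(K)$, with equality if
$f$ is concave; this is because for any continuous concave function $f$ we may write

\[ f(\varphi) = \inf\{g(\varphi) \mid g \in A(K),\ g \ge f\}. \tag{B.186} \]

In terms of this, for any fixed element $\varphi_0 \in K$ we define $p : C(K,\mathbb{R}) \to \mathbb{R}$ by

\[ p(f) = \hat{f}(\varphi_0). \tag{B.187} \]

Since $\widehat{f+g} \le \hat{f} + \hat{g}$ and $\widehat{tf} = t\hat{f}$ for $t \ge 0$, as is easily verified, it follows that $p$ is
sublinear (cf. Definition B.38). We define a linear subspace $W \subset C(K,\mathbb{R})$ by

\[ W = A(K) + \mathbb{R} \cdot f_0, \tag{B.188} \]

endowed with the 'hatted' evaluation map $\widehat{\mathrm{ev}}_{\varphi_0} : W \to \mathbb{R}$ defined by

\[ \widehat{\mathrm{ev}}_{\varphi_0}(g + sf_0) = g(\varphi_0) + s\hat{f}_0(\varphi_0); \tag{B.189} \]

since $\hat{g} = g$ for any $g \in A(K)$, for $s \ge 0$ we have $\widehat{\mathrm{ev}}_{\varphi_0}(g + sf_0) = \mathrm{ev}_{\varphi_0}(\widehat{g + sf_0})$.

It is easy to show that $p$ dominates $\widehat{\mathrm{ev}}_{\varphi_0}$, so that the Hahn-Banach Theorem
B.40 yields an extension of $\widehat{\mathrm{ev}}_{\varphi_0}$ to $C(K,\mathbb{R})$ that satisfies $\widehat{\mathrm{ev}}_{\varphi_0}(f) \le \hat{f}(\varphi_0)$.

This implies that $\widehat{\mathrm{ev}}_{\varphi_0}$ is positive; to see this, take $f \le 0$. Since the zero function
is in $A(K)$ we have $\hat{f} \le 0$ also, so that $\widehat{\mathrm{ev}}_{\varphi_0}(f) \le 0$. Passing to $-f$, we find that
$\widehat{\mathrm{ev}}_{\varphi_0}(f) \ge 0$ whenever $f \ge 0$. Furthermore, since $1_K \in A(K) \subset W$, we have
$\widehat{\mathrm{ev}}_{\varphi_0}(1_K) = 1_K(\varphi_0) = 1$.

Therefore, $\widehat{\mathrm{ev}}_{\varphi_0}$ is a state on $C(K)$. Corollary B.17 then turns $\widehat{\mathrm{ev}}_{\varphi_0}$ into a probability
measure $\mu$ on $K$. Taking $f = \hat{v}$ for some $v \in V$, we have $f \in A(K) \subset W$, so that

\[ \int_K d\mu(\omega)\,\omega(v) = \int_K d\mu\,\hat{v} = \mu(\hat{v}) = \widehat{\mathrm{ev}}_{\varphi_0}(\hat{v}) = \hat{v}(\varphi_0) = \varphi_0(v). \tag{B.190} \]

This is almost (B.182) with $\varphi \rightsquigarrow \varphi_0$; what we still need to prove is the property

\[ \mathrm{supp}(\mu) \subseteq \partial_e K. \tag{B.191} \]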


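This will be proved in two steps. For any $f \in C(K)$, we define $K(f) \subseteq K$ by

\[ K(f) = \{\varphi \in K \mid f(\varphi) = \hat{f}(\varphi)\}. \tag{B.192} \]

We will separately show that

\[ \mathrm{supp}(\mu) \subseteq K(f_0); \tag{B.193} \]

\[ K(f_0) \subseteq \partial_e K. \tag{B.194} \]

Towards (B.193) we start showing that

\[ \mu(f_0) = \mu(\hat{f}_0), \tag{B.195} \]

which is a conjunction of $\mu(f_0) \le \mu(\hat{f}_0)$ and $\mu(f_0) \ge \mu(\hat{f}_0)$. The first is true for any
$f \in C(K)$, since $\mu$ is positive and $f \le \hat{f}$ (pointwise). The second is specific to $f_0$:

\[
\begin{aligned}
\mu(f_0) = \widehat{\mathrm{ev}}_{\varphi_0}(f_0) = \mathrm{ev}_{\varphi_0}(\hat{f}_0) &= \hat{f}_0(\varphi_0) \\
&= \inf\{g(\varphi_0) \mid g \in A(K),\ g \ge f_0\} \\
&= \inf\{\mu(g) \mid g \in A(K),\ g \ge f_0\},
\end{aligned}
\tag{B.196}
\]

since for $g \in A(K)$ we have $g(\varphi_0) = \mu(g)$ because $A(K) \subset W$. If in addition $g \ge f_0$,
we have $g \ge \hat{f}_0$, which implies $\mu(g) \ge \mu(\hat{f}_0)$. This inequality survives the infimum
in (B.196), so that we finally obtain $\mu(f_0) \ge \mu(\hat{f}_0)$, and hence (B.195).

We now prove (B.193) from (B.195). Since $f_0 \le \hat{f}_0$, for each $n > 0$ we may define

\[ K_n = \{\varphi \in K \mid \hat{f}_0(\varphi) - f_0(\varphi) \ge 1/n\}. \tag{B.197} \]

Then $0 \le \mu(K_n) \le n \cdot \mu(\hat{f}_0 - f_0)$, which vanishes by (B.195). Hence $\mu(K_n) = 0$
for each $n$, and therefore $\mu(\bigcup_n K_n) = 0$. But $\bigcup_n K_n = K(f_0)^c$, so (B.193) follows.

Eq. (B.194) is equivalent to the inclusion $(\partial_e K)^c \subseteq K(f_0)^c$, i.e., the implication:

if $\varphi = t\varphi_1 + (1-t)\varphi_2$ for some $t \in (0,1)$ and $\varphi_1 \neq \varphi_2$, then $\hat{f}_0(\varphi) \neq f_0(\varphi)$.

Indeed, strict convexity of $f_0$ (used at last!) and the familiar property $f_0 \le \hat{f}_0$ give

\[
\begin{aligned}
\hat{f}_0(\varphi) &= \inf\{tg(\varphi_1) + (1-t)g(\varphi_2) \mid g \in A(K),\ g \ge f_0\} \\
&\ge t\inf\{g(\varphi_1) \mid g \in A(K),\ g \ge f_0\} + (1-t)\inf\{g(\varphi_2) \mid g \in A(K),\ g \ge f_0\} \\
&= t\hat{f}_0(\varphi_1) + (1-t)\hat{f}_0(\varphi_2) \ge tf_0(\varphi_1) + (1-t)f_0(\varphi_2) > f_0(\varphi). \quad \Box
\end{aligned}
\]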
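The mechanism of the proof, in which the contact set of a strictly convex function with its concave envelope picks out the extreme boundary, can be seen in the simplest case $K = [0,1]$ with $f_0(x) = x^2$. The Python sketch below approximates the concave envelope by taking the infimum over a finite, hypothetical family of affine majorants (slopes on a grid, each with the smallest admissible intercept on a sample grid); the discretization is an assumption of this illustration and not part of the proof.

```python
def concave_envelope(f, xs, slopes):
    # For each slope a, the smallest intercept b such that a*x + b >= f(x)
    # at every sample point x; the envelope is the pointwise minimum of
    # these affine majorants.
    intercepts = [max(f(x) - a * x for x in xs) for a in slopes]

    def f_hat(x):
        return min(a * x + b for a, b in zip(slopes, intercepts))

    return f_hat

def f0(x):
    # Strictly convex function on K = [0, 1].
    return x * x

xs = [i / 100 for i in range(101)]            # sample points in [0, 1]
slopes = [k / 10 for k in range(-30, 31)]     # includes the chord slope 1.0
f0_hat = concave_envelope(f0, xs, slopes)

# Points where f0 and its envelope coincide (up to rounding).
contact_set = [x for x in xs if abs(f0_hat(x) - f0(x)) < 1e-9]
```

The envelope is the chord $x \mapsto x$, and the contact set reduces to the two endpoints, that is, to the extreme boundary $\{0,1\}$ of the interval.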
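In turn, the existence of some measure $\mu$ in (B.182) representing an arbitrary
point $\varphi \in K$ implies the Krein-Milman Theorem. We rewrite (B.182) as

\[ \hat{v}(\varphi) = \int_{\partial_e K^-} d\mu\, \hat{v}, \tag{B.198} \]

where $\varphi \in K$ is arbitrary and $\hat{v} \in C(K)$ is the (affine) continuous function on
$K \subset V^*$ induced by the functional $\hat{v} \in V^{**}$ on $V^*$ defined by $v \in V$ under the canonical
injection $V \hookrightarrow V^{**}$, $v \mapsto \hat{v}$, see Proposition B.44. From (B.198) and (B.34) we obtain

\[ \inf\{\hat{v}(\omega) \mid \omega \in \partial_e K^-\} \le \hat{v}(\varphi) \le \sup\{\hat{v}(\omega) \mid \omega \in \partial_e K^-\}, \]

which, because $\partial_e K^- \subseteq (\mathrm{co}(\partial_e K))^-$, also gives the inequality

\[ \inf\{\hat{v}(\omega) \mid \omega \in (\mathrm{co}(\partial_e K))^-\} \le \hat{v}(\varphi) \le \sup\{\hat{v}(\omega) \mid \omega \in (\mathrm{co}(\partial_e K))^-\}. \]

This forces $\varphi \in (\mathrm{co}(\partial_e K))^-$, for if $\varphi \notin (\mathrm{co}(\partial_e K))^-$ we would obtain a contradiction
with Theorem B.43 (which is a version of the Hahn-Banach Theorem), or more
precisely, with the alternative version thereof stated after its proof, with $A = \{\varphi\}$
closed and $B = (\mathrm{co}(\partial_e K))^-$ compact and convex (and, of course, $\varphi \rightsquigarrow \hat{v}$). Therefore,
$K \subseteq (\mathrm{co}(\partial_e K))^-$, which implies (B.165).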

If only to illustrate Choquet's Theorem, we note that existence of the probability
measure $\mu$ in the Riesz Representation Theorem B.15 follows from it. To see this,
fix some compact Hausdorff space $X$, and take $V = C(X,\mathbb{R})$ (as a real Banach space
in the supremum-norm) and $K = S(C(X,\mathbb{R})) \subset V^*$, i.e., the set of positive linear
functionals $\varphi : C(X,\mathbb{R}) \to \mathbb{R}$ that satisfy $\varphi(1_X) = 1$. By the argument following
Definition 1.14, $K$ coincides with the state space $S(C(X))$ of the commutative
C*-algebra $C(X)$, which is a complex Banach space (cf. Appendix C), in that each $\varphi \in K$
extends uniquely to a state $\varphi : C(X) \to \mathbb{C}$ by complex linearity, which extension
remains positive in the sense of Definition C.3. From Propositions C.14 and C.19,
the map $X \to V^*$ given by $x \mapsto \mathrm{ev}_x$, where $\mathrm{ev}_x(f) = f(x)$ is the evaluation map at $x$,
takes values in $\partial_e K$ and yields a homeomorphism

\[ \partial_e K \cong X. \tag{B.199} \]

In particular, $\partial_e K$ is closed in $V^*$ (and in $K$), so (B.182) comes down to (B.39).

The part of Theorem B.15 that does not follow from Theorem B.53 is the possible
uniqueness of the measure $\mu$ on $\partial_e K^-$ that represents the point $\varphi \in K$. Uniqueness
of the measure in Choquet's Theorem is settled by the following notion.

Definition B.54. A (Choquet) simplex is a compact convex set $K \subset V^*$ whose associated
convex cone $\tilde K = \mathbb{R}^+ K = \{t\omega \mid t \geq 0, \omega \in K\}$ (cf. Definition C.50) is a
lattice in the partial ordering $\leq$ defined by $\rho \leq \sigma$ iff $\sigma - \rho \in \tilde K$.

Here we assume that for any $\rho \in \tilde K$ there is a unique $t \in \mathbb{R}^+$ and $\omega \in K$ such that
$t\omega = \rho$; this is the case if $K = \tilde K \cap H$ for some closed hyperplane $H$ in $V^*$ that does
not contain the origin. For example, if $K = S(A)$ is the state space of some unital
C*-algebra $A$, then $H = \{\varphi \in A^* \mid \varphi(1_A) = 1\}$ and $\tilde K = \{\varphi \in A^* \mid \varphi \geq 0\}$.

In finite dimension, Choquet simplices are special convex polytopes called simplices.
Recall that the so-called regular polyhedra were classified (up to affine isomorphism)
by Schläfli in 1852, who showed that the only possibilities are:

• The simplices $\Delta_n = \{x \in \mathbb{R}^{n+1} \mid x_i \geq 0, \sum_i x_i = 1\}$, $n \geq 1$;

• The cubes $\square_n = \{x \in \mathbb{R}^n \mid -1 \leq x_i \leq 1\}$, $n \geq 1$;

• The cross-polytopes $\Diamond_n = \{x \in \mathbb{R}^n \mid \sum_i |x_i| \leq 1\}$, $n \geq 1$;

• The countably many regular polygons in $\mathbb{R}^2$ (which include $\Delta_2$, $\square_2$, $\Diamond_2$);

• The five Platonic solids in $\mathbb{R}^3$ (which include $\Delta_3$, $\square_3$, $\Diamond_3$);

• The six regular polychora in $\mathbb{R}^4$ (which include $\Delta_4$, $\square_4$, $\Diamond_4$).

An $n$-dimensional simplex is affinely homeomorphic to the convex hull of $n+1$ linearly
independent points (or, equivalently, $|\partial_e K| = n+1$). In particular, the simplex
$\Delta_n$ is the set $\mathrm{Pr}(n+1)$ of all probability distributions on a set $X$ of cardinality
$n+1$, cf. Definition 1.9. Generalizing this idea, if $X$ is a compact Hausdorff
space, then the state space $S(C(X))$ of the associated commutative C*-algebra $C(X)$,
which as we know consists of all probability measures on $X$, is a Choquet simplex.
In the notation of Theorem B.53, the simplest result (again due to Choquet) is:

Theorem B.55. Suppose $K$ is metrizable, and assume $\mathrm{supp}(\mu) \subset \partial_e K$ in (B.182).
Then $\mu$ is uniquely determined by its barycenter $\varphi$ iff $K$ is a Choquet simplex.

However, we note that without any assumption on $K$, conversely the barycenter $\varphi$
for which (B.182) holds for all $v \in V$ is uniquely determined by $\mu$. This observation
gives rise to a map $B$ from the compact convex set $M(K)^+_1$ of all probability measures
on $K$ to $K$ itself, such that $B(\mu)$ is the unique point in $K$ such that (B.198) with
$\varphi = B(\mu)$ holds for all $v \in V$. This map $B$ is, in fact, affine as well as continuous.

Theorem B.55 covers finite phase spaces in classical mechanics as well as,
negatively, finite-dimensional Hilbert spaces in quantum mechanics: in the former
case, any state admits a unique decomposition into pure states (cf. Proposition 1.13),
whereas in the latter this fails. For example, for $H = \mathbb{C}^2$, the state space
$S(B(H)) \cong B^3$ (see Proposition 2.9) is not a simplex. See also Proposition 2.14.
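
A minimal numerical sketch of this failure of uniqueness for $H = \mathbb{C}^2$, in plain Python: the maximally mixed state is obtained as an equal mixture of two different pairs of pure states. The helper names `projector` and `mix`, as well as the particular vectors, are illustrative assumptions rather than notation from the text.

```python
# Density matrices on C^2 represented as 2x2 nested lists of complex numbers.

def projector(v):
    """Rank-one projection |v><v| onto a unit vector v in C^2."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def mix(t, rho, sigma):
    """Convex combination t*rho + (1-t)*sigma of two density matrices."""
    return [[t * rho[i][j] + (1 - t) * sigma[i][j] for j in range(2)]
            for i in range(2)]

s = 2 ** -0.5
up, down = (1 + 0j, 0j), (0j, 1 + 0j)               # one orthonormal basis
plus, minus = (s + 0j, s + 0j), (s + 0j, -s + 0j)   # a second, rotated basis

rho_z = mix(0.5, projector(up), projector(down))
rho_x = mix(0.5, projector(plus), projector(minus))

# Largest entrywise difference between the two mixtures (zero up to rounding).
max_diff = max(abs(rho_z[i][j] - rho_x[i][j])
               for i in range(2) for j in range(2))
```

Both mixtures give the same state, although the pure states entering them differ, so the representing measure on the pure states is not unique.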

To explain the general (i.e., non-metrizable) case, we first define the Choquet ordering
$\prec$ on the set of probability measures on $K$ by $\mu \prec \nu$ iff $\mu(f) \leq \nu(f)$ for any
convex function $f \in C(K,\mathbb{R})$. Noting that $B(\mu) = B(\nu)$ whenever $\mu \prec \nu$, the idea
is that since the values of convex functions almost by definition increase towards
the boundary $\partial_e K$, probability measures on $K$ with given barycenter that are maximal
with respect to $\prec$ should be supported on $\partial_e K$ (such maximal measures always
exist by a Zorn's Lemma argument). This intuition is indeed correct, provided $K$
is metrizable, in which case, conversely, the condition $\mathrm{supp}(\mu) \subset \partial_e K$ in Theorem
B.55 forces $\mu$ to be maximal. In general, an alternative way to prove the first part of
Theorem B.53 would be to take some maximal $\mu$ with given barycenter $\varphi$.

The key to the generalization of Theorem B.55 to the possibly non-metrizable
case, then, is to replace the assumption $\mathrm{supp}(\mu) \subset \partial_e K$ by maximality of $\mu$. This is
achieved by the major Choquet-Meyer Theorem, which we state without proof:

Theorem B.56. Assume the measure $\mu$ in (B.182) is maximal with respect to $\prec$.
Then $\mu$ is uniquely determined by its barycenter $\varphi$ iff $K$ is a Choquet simplex.

B.12 A précis of infinite-dimensional Hilbert space

The main difference between infinite-dimensional Hilbert spaces and their finite-dimensional
counterparts lies in issues of convergence and completeness. Every
linear subspace of a finite-dimensional Hilbert space is automatically complete (cf.
Proposition B.5), and all sums one encounters are finite. In infinite dimension, $\ell_c(\mathbb{N})$
is a linear but incomplete subspace of $\ell^2(\mathbb{N})$, and similarly for $C_c(\mathbb{R}) \subset L^2(\mathbb{R})$; the
expansion of some vector in terms of a basis already involves an infinite sum.

Note that in metric spaces a subset is closed iff it is sequentially complete (in
that it contains all limits of Cauchy sequences); this can be seen from the fact that
the metric topology is generated by $\varepsilon$-balls and hence by $(1/n)$-balls, $n \in \mathbb{N}$. Consequently,
in Banach spaces (and hence in Hilbert spaces) $H$, the property of some
subspace $L \subset H$ being (metrically) complete (in the sense that every Cauchy sequence
in $L$ converges to an element of $L$) is the same as $L$ being (topologically)
closed (in the sense that the set-theoretic complement is open). Following tradition
in functional analysis, we will henceforth speak of closed subspaces. We denote
the (metric or topological) closure of $S \subset H$ in $H$ by $S^-$.

An exhaustive way of guaranteeing that some linear subspace $L \subset H$ is closed is
to exhibit it as an orthogonal complement $L = S^\perp$, where $S \subset H$ is any subset: we
write $\psi \perp S$ iff $\langle \chi, \psi \rangle = 0$ for each $\chi \in S$, and, as in (A.29), put

$S^\perp = \{\psi \in H \mid \psi \perp S\}$. (B.200)

We also use the double orthogonal complement $S^{\perp\perp} = (S^\perp)^\perp$, et cetera.

Proposition B.57. Let $H$ be a Hilbert space.

1. If $S \subset H$ is any subset, $S^\perp$ is a closed linear subspace of $H$.

2. For each closed linear subspace $L \subset H$, one has

$H = L \oplus L^\perp$, (B.201)

in the sense that

$L \cap L^\perp = \{0\}$, (B.202)

and each vector $\psi \in H$ has a unique decomposition

$\psi = \psi^\parallel + \psi^\perp$, (B.203)

where $\psi^\parallel \in L$ and $\psi^\perp \in L^\perp$.

3. For any closed linear subspace $L$ one has $L^{\perp\perp} = L$.

4. For any linear subspace $L$, one has

$L^{\perp\perp} = L^-$, (B.204)

and hence $L^- = H$ iff $\langle \psi, \varphi \rangle = 0$ for each $\varphi \in L$ implies $\psi = 0$.

5. For any subset $S \subset H$, one has $S^{\perp\perp\perp} = S^\perp$.

6. For any subset $S \subset H$, the closure $[S]^-$ of the (finite) linear span $[S]$ of $S$ is $S^{\perp\perp}$.

Proof. 1. Linearity of $S^\perp$ follows from linearity of the inner product. If $\psi_n \in S^\perp$
and $\psi_n \to \psi$, then for $\chi \in S$ and each $n$, we have

$|\langle \chi, \psi \rangle| = |\langle \chi, \psi - \psi_n \rangle| \leq \|\chi\| \|\psi - \psi_n\|$. (B.205)

Taking $n \to \infty$ gives $\langle \chi, \psi \rangle = 0$ and hence $\psi \in S^\perp$, so that $S^\perp$ is closed.

2. The proof of the infinite-dimensional case (cf. Corollary A.9 for finite dimension)
relies on Lemma B.58 below, which explains why $L$ needs to be closed, and
also neatly identifies $\psi^\parallel$ as the unique vector in $L$ at minimal distance to $\psi$.
Granting this important lemma, let $\psi \in H$ and take

$C = \psi + L = \{\psi + \varphi, \varphi \in L\}$. (B.206)

Lemma B.58 yields a unique vector $\chi_0 \in C$ of minimal norm, from which we define
$\psi^\parallel = \psi - \chi_0$ and $\psi^\perp = \chi_0$ (so that $\|\psi^\parallel - \psi\| = \|\chi_0\|$ is minimal). Then $\psi^\parallel \in L$, and (B.203)
holds by construction. To show that $\chi_0 \in L^\perp$, we rewrite the inequality $\|\chi_0\| \leq
\|\psi + \xi\|$ (for all $\xi \in L$) as $\|\chi_0\| \leq \|\chi_0 + \xi\|$, since $\psi = \chi_0 + \psi^\parallel$ and $\psi^\parallel \in L$.
Putting $\xi = -(\langle \zeta, \chi_0 \rangle / \|\zeta\|^2)\, \zeta$, with $\zeta \in L$ arbitrary (but nonzero), the last inequality
(squared) reads $0 \leq -|\langle \zeta, \chi_0 \rangle|^2 / \|\zeta\|^2$, whence $\langle \zeta, \chi_0 \rangle = 0$ for all $\zeta \in L$, so that
$\chi_0 \in L^\perp$. Uniqueness of the decomposition (B.203) follows as in Corollary A.9.

3. Trivially, $L \subset L^{\perp\perp}$. To prove the converse inclusion, use the previous item.

4. If $A \subset B$, then $B^\perp \subset A^\perp$ and hence $A^{\perp\perp} \subset B^{\perp\perp}$. With $A = L$ and $B = L^-$, this
gives $L^{\perp\perp} \subset (L^-)^{\perp\perp} = L^-$ (where, $L^-$ being closed, we used the previous item).
Conversely, $L \subset L^{\perp\perp}$ and hence $L^- \subset L^{\perp\perp}$, since $L^{\perp\perp}$ is closed by the first item.

5. Take $L = S^\perp$ and use the third item.

6. Proceeding as in the proof of no. 1, from the continuity of the inner product we
find $S^\perp = ([S]^-)^\perp$, and hence, using no. 3, finally $S^{\perp\perp} = ([S]^-)^{\perp\perp} = [S]^-$. □
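
The orthogonal decomposition of Proposition B.57.2 can be checked numerically in the simplest case, where $L$ is the line spanned by a single nonzero vector in $\mathbb{C}^3$. The following Python sketch is illustrative only: the names `inner`, `norm` and `project_onto_line` and the sample vectors are assumptions, and the inner product is taken to be linear in its second argument.

```python
def inner(x, y):
    """Inner product on C^n, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def project_onto_line(zeta, psi):
    """Component of psi in the closed subspace L spanned by zeta != 0."""
    c = inner(zeta, psi) / inner(zeta, zeta)
    return [c * z for z in zeta]

zeta = [1 + 0j, 1j, 0j]
psi = [2 + 0j, 1 + 1j, 3 + 0j]

psi_par = project_onto_line(zeta, psi)              # component in L
psi_perp = [p - q for p, q in zip(psi, psi_par)]    # component in the complement

# psi_perp is orthogonal to L ...
overlap = abs(inner(zeta, psi_perp))
# ... and no other point t*zeta of L is closer to psi than psi_par.
distances = [norm([p - t * z for p, z in zip(psi, zeta)])
             for t in (0, 1, 1.5 - 0.5j, 2, 1j)]
```

Here `overlap` vanishes up to rounding, and every entry of `distances` is at least `norm(psi_perp)`, in line with the minimal-distance characterization used in the proof.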

Lemma B.58. The norm assumes a unique minimum on any closed convex set $C \subset H$
(i.e., there is a unique $\chi_0 \in C$ such that $\|\chi_0\| < \|\chi\|$ for each $\chi \in C$, $\chi \neq \chi_0$).

Proof. Let $\mu = \inf\{\|\chi\|, \chi \in C\}$, which exists, as $\|\chi\| \geq 0$. Hence there is a minimizing
sequence $(\chi_n)$ in $C$ with $\|\chi_n\| \to \mu$, which we now prove to be Cauchy (in
$H$). Since $C$ is convex, $\frac{1}{2}(\chi_n + \chi_m) \in C$, and therefore $\|\chi_n + \chi_m\| \geq 2\mu$. Thus

$0 \leq \|\chi_n - \chi_m\|^2 = 2(\|\chi_n\|^2 + \|\chi_m\|^2) - \|\chi_n + \chi_m\|^2 \leq 2(\|\chi_n\|^2 + \|\chi_m\|^2) - 4\mu^2$,

and since $2(\|\chi_n\|^2 + \|\chi_m\|^2) \to 4\mu^2$ as $n, m \to \infty$, we must have $\|\chi_n - \chi_m\| \to 0$.
Since $C$ is closed, $\chi_n \to \chi_0$ for some $\chi_0 \in C$. To prove uniqueness, let another minimizing
sequence $(\chi'_n)$ converge to $\chi'_0 \in C$. Then $\frac{1}{2}(\chi_0 + \chi'_0) \in C$, so we obtain

$\|\chi_0 + \chi'_0\| \geq 2\mu = \|\chi_0\| + \|\chi'_0\|$.

The inequality $\|\chi_0 + \chi'_0\| \leq \|\chi_0\| + \|\chi'_0\|$ gives $\|\chi_0 + \chi'_0\| = \|\chi_0\| + \|\chi'_0\|$, i.e.,
$\mathrm{Re}\langle \chi_0, \chi'_0 \rangle = \|\chi_0\| \|\chi'_0\|$. Cauchy-Schwarz gives $|\langle \chi_0, \chi'_0 \rangle| \leq \|\chi_0\| \|\chi'_0\|$ with equality
iff $\chi_0$ and $\chi'_0$ are proportional, so the previous equality can hold only if $\chi'_0 = t\chi_0$
for some $t > 0$. Since $\chi_0$ and $\chi'_0$ both minimize the norm, we have $t = 1$. □

We now turn to the important concept of a basis of a Hilbert space; as in the
previous appendix, a basis of a Hilbert space always denotes an orthonormal basis.
To define this notion, we first say that some subset $\{v_i\}_{i \in I}$ of $H$ is orthonormal if

$\langle v_i, v_j \rangle = \delta_{ij}$; (B.207)

this condition guarantees that the $v_i$ are linearly independent (and easy to calculate
with!). Second, in finite dimension (where $I$ must be finite) we may simply define
a basis of $H$ as an orthonormal set that is also a basis in the usual (linear algebra)
sense. This idea remains valid for general Hilbert spaces, except that we should use
Definition B.6 to define infinite sums (and Lemma B.7 to analyze them). Theorem
B.61 to come gives an exhaustive account of the situation, but we first need a lemma
on general orthonormal sets (that do not necessarily form a basis).

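Lemma B.59. If $\{v_i\}_{i \in I}$ is an orthonormal set in $H$ and $c_i \in \mathbb{C}$, then the sum

$\psi = \sum_{i \in I} c_i v_i$ (B.208)

converges in $H$ (in the sense of Definition B.6) iff

$\sum_{i \in I} |c_i|^2 < \infty$. (B.209)

If this is the case, the coefficients $c_i \in \mathbb{C}$ are given by

$c_i = \langle v_i, \psi \rangle$. (B.210)

Proof. The first claim follows from Proposition B.8 and the elementary computation

$\| \sum_{i \in G'} c_i v_i \|^2 = \sum_{i \in G'} \| c_i v_i \|^2 = \sum_{i \in G'} |c_i|^2$, (B.211)

where $G' \subset I$ is finite, so that the sums $\sum_{i \in I} c_i v_i$ and $\sum_{i \in I} |c_i|^2$ either both exist (i.e.,
converge) or both do not exist. When $I$ is countable this follows more simply by
noting that $\sum_{i \in \mathbb{N}} c_i v_i$ converges iff $(s_n)$ is a Cauchy sequence, where $s_n = \sum_{i=1}^n c_i v_i$,
and computing $\|s_n - s_m\|^2 = \sum_{i=m+1}^n |c_i|^2$, where $n > m$. To prove (B.210) on the
assumption that (B.208) exists, by the Cauchy-Schwarz inequality, for any $\varepsilon > 0$ and
any sufficiently large finite $G \subset I$ containing $j$,

$|\langle v_j, \psi \rangle - c_j| = |\langle v_j, \psi - \sum_{i \in G} c_i v_i + \sum_{i \in G} c_i v_i \rangle - c_j|$

$= |\langle v_j, \psi - \sum_{i \in G} c_i v_i \rangle| \leq \|v_j\| \, \|\psi - \sum_{i \in G} c_i v_i\| < \varepsilon$,

where we used Definition B.6 as well as $\|v_j\| = 1$. Letting $\varepsilon \to 0$ yields (B.210). □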

Lemma B.60. Let $\{v_i\}_{i \in I}$ be an orthonormal set in $H$. We have Bessel's Inequality

$\sum_{i \in I} |\langle v_i, \psi \rangle|^2 \leq \|\psi\|^2 \quad (\psi \in H)$. (B.212)

Proof. For any finite $G \subset I$, a computation based on (A.2) yields

$\sum_{i \in G} |\langle v_i, \psi \rangle|^2 = \|\psi\|^2 - \|\psi - \sum_{i \in G} \langle v_i, \psi \rangle v_i\|^2 \leq \|\psi\|^2$. (B.213)

It follows that also the supremum of the left-hand side over all finite subsets $G \subset I$
is bounded by $\|\psi\|^2$ and hence is finite. By Lemma B.7, this supremum equals
$\sum_{i \in I} |\langle v_i, \psi \rangle|^2$, which gives (B.212). □
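
As a small sanity check of Bessel's Inequality, the following Python sketch uses an orthonormal set of two vectors in $\mathbb{R}^3$ that is not a basis, so the inequality is strict. The vectors and the names `bessel_sum` and `defect` are illustrative assumptions; the check also confirms that the gap equals the squared norm of the part of $\psi$ not captured by the orthonormal set.

```python
def inner(x, y):
    """Inner product, linear in the second argument (real floats work too)."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

s = 2 ** -0.5
v1 = [1.0, 0.0, 0.0]
v2 = [0.0, s, s]           # v1, v2 are orthonormal but do not span R^3
psi = [1.0, 2.0, 3.0]

coeffs = [inner(v, psi) for v in (v1, v2)]
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
norm_sq = inner(psi, psi)

# Remainder psi - sum_i <v_i, psi> v_i and its squared norm.
remainder = [p - coeffs[0] * a - coeffs[1] * b
             for p, a, b in zip(psi, v1, v2)]
defect = inner(remainder, remainder)
```

In this example `bessel_sum` is strictly smaller than `norm_sq`, and `bessel_sum + defect` reproduces `norm_sq` up to rounding.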

Theorem B.61. Let $B = \{v_i\}_{i \in I}$ be an orthonormal subset of a Hilbert space $H$. The
following conditions are equivalent (and each defines $B$ to be a basis of $H$):

1. Any $\psi \in H$ can be written (in the sense of Definition B.6) as $\psi = \sum_{i \in I} c_i v_i$, with $c_i \in \mathbb{C}$.

2. For each $\psi \in H$, one has Parseval's equality

$\sum_{i \in I} |\langle v_i, \psi \rangle|^2 = \|\psi\|^2$. (B.214)

3. For any $\psi, \varphi \in H$ one has

$\langle \varphi, \psi \rangle = \sum_{i \in I} \langle \varphi, v_i \rangle \langle v_i, \psi \rangle$. (B.215)

4. $B$ is not properly contained in any other orthonormal set (i.e., $B$ is maximal).

5. $B^\perp = \{0\}$.

6. $B^{\perp\perp} = H$.

7. The closure of the linear span of $B$ is $H$.

Note that (B.215) is used in almost every computation in quantum physics, in which
one also typically has $\|\psi\| = 1$. In that case, (B.214) at least formally turns the
$|c_i|^2 = |\langle v_i, \psi \rangle|^2$ into (Born) probabilities, as discussed throughout the main text.

Proof. Assuming (B.208) and hence (B.210), take $\varepsilon > 0$ and find $F \subset I$ (finite) so
that $\|\psi - \sum_{i \in G} c_i v_i\| < \varepsilon$ for all finite $G \supseteq F$. By (B.213), this gives

$| \sum_{i \in G} |\langle v_i, \psi \rangle|^2 - \|\psi\|^2 | < \varepsilon^2$. (B.216)

Hence (B.214) holds in the sense of Definition B.6 (with $V = \mathbb{C}$). Conversely, assuming
(B.214), eq. (B.213) gives (B.208). This proves the equivalence 1 ⇔ 2.

Clearly, (B.214) is a special case of (B.215), which in turn follows from (B.208)
with (B.210) and continuity of the inner product, whence 3 ⇒ 2 and 1 ⇒ 3.

Furthermore, 1 ⇒ 5 follows by contradiction: given (B.210), any nonzero vector
$\psi \in B^\perp$ could not possibly be written as (B.208). Conversely, 5 ⇒ 1 most easily
follows by contradiction, too. For any $\psi \in H$, the sum $\varphi = \sum_i \langle v_i, \psi \rangle v_i$ exists in $H$
by Lemma B.59 (combined with Bessel's Inequality). Continuity of the inner product
yields $\langle v_j, \varphi \rangle = \langle v_j, \psi \rangle$ and hence $\langle v_j, \varphi - \psi \rangle = 0$ for each $j \in I$, whence
$\varphi - \psi \in B^\perp$. If $\psi$ cannot be written in the form (B.208) we have $\varphi \neq \psi$, so
$B^\perp \neq \{0\}$, which is the desired contradiction.

Finally, 4 ⇔ 5 is tautological, 5 ⇔ 6 is trivial, and 6 ⇔ 7 is a special case of
Proposition B.57.6 (hence this proposition is needed only for no. 7). □

For example, if $H = \ell^2(S)$, then one may take $I = S$, with $v_x = \delta_x$. Since $S$ is an
arbitrary set, this example shows that any cardinality of $I$ may, in principle, occur.
The existence of a basis has a remarkable consequence, for which we need:

Definition B.62. Two Hilbert spaces $H_1$ and $H_2$ are called isomorphic, written
$H_1 \cong H_2$, if they are isometrically isomorphic, that is, if there is an invertible linear
map $u \in B(H_1, H_2)$ such that

$\|u\psi\|_{H_2} = \|\psi\|_{H_1} \quad (\psi \in H_1)$. (B.217)

By Theorem A.3, a specific surjective isometry $u : H_1 \to H_2$ implementing an isomorphism
is automatically unitary, in that it is surjective and satisfies

$\langle u\psi, u\varphi \rangle_{H_2} = \langle \psi, \varphi \rangle_{H_1}$. (B.218)

Conversely, a unitary map is an isometric isomorphism, so that isometric isomorphism
of Hilbert spaces (seen as Banach spaces) is the same as unitary isomorphism.
The following theorem (due to von Neumann, who was a specialist in both Hilbert
space theory and axiomatic set theory) shows that the classification of Hilbert spaces
up to isomorphism reduces to the classification of sets up to bijection.

Theorem B.63. 1. Any Hilbert space has a basis.

2. All bases of a given Hilbert space $H$ have the same cardinality (which is then,
consistently, called the dimension of $H$).

3. Two Hilbert spaces are isomorphic iff they have the same dimension.

Specifically, clause 2 states that if $(v_i)_{i \in I}$ and $(v'_j)_{j \in J}$ are both bases of $H$, then $I \cong J$
as sets (i.e., there is a bijection $I \to J$). Similarly, clause 3 states that $H_1 \cong H_2$ iff $H_1$
has a basis $(v_i)_{i \in I}$ and $H_2$ has a basis $(v'_j)_{j \in J}$ for which $I \cong J$.

Proof. 1. The general proof is, alas, based on Zorn's Lemma: the collection $\mathcal{O}$ of
all orthonormal sets in $H$ is ordered by inclusion and each totally (i.e. linearly)
ordered subset has an upper bound, namely its union. Hence $\mathcal{O}$ has a maximal
element, which is a basis by Theorem B.61.4. Fortunately, in case that $H$ is
(topologically) separable (in that it contains a countable dense subset), a basis
may be constructed by the well-known Gram-Schmidt procedure, as follows:
let $(\psi_1, \psi_2, \ldots)$ be a countable dense subset of $H$, for simplicity already taken to
be linearly independent (otherwise, remove linear combinations first), start with
$v_1 = \psi_1 / \|\psi_1\|$, inductively define $w_n = \psi_n - \sum_{i=1}^{n-1} \langle v_i, \psi_n \rangle v_i$, $n \in \mathbb{N}$, which
already yields an orthogonal set, and finally normalize to $v_n = w_n / \|w_n\|$.

2. We only prove the case where one basis, say $\{v_i\}_{i \in I}$, is finite in some detail.
Take another basis $\{v'_j\}_{j \in J}$. From (B.214) and (B.215),

$|I| = \sum_{i \in I} \|v_i\|^2 = \sum_{i \in I} \sum_{j \in J} |\langle v'_j, v_i \rangle|^2 = \sum_{j \in J} \sum_{i \in I} |\langle v_i, v'_j \rangle|^2 = \sum_{j \in J} \|v'_j\|^2 = |J|$.

A similar computation excludes the possibility that $I$ is countable and $J$ is not.
The general case relies on some cardinal arithmetic, which we spare the reader.

3. Let $\{v_i\}_{i \in I}$ be a basis of $H$ and let $\{v'_j\}_{j \in J}$ be a basis of $H'$. Assume $I \cong J$, so
that there is a bijection $b : I \to J$. Define $u : H \to H'$ and $v : H' \to H$ by linear
extension of $u v_i = v'_{b(i)}$ and $v v'_j = v_{b^{-1}(j)}$, that is,

$u\psi = \sum_{i \in I} \langle v_i, \psi \rangle v'_{b(i)} = \sum_{j \in J} \langle v_{b^{-1}(j)}, \psi \rangle v'_j$; (B.219)

$v\psi' = \sum_{j \in J} \langle v'_j, \psi' \rangle v_{b^{-1}(j)} = \sum_{i \in I} \langle v'_{b(i)}, \psi' \rangle v_i$, (B.220)

where in each line the first equality sign is the definition of the map, whilst the
second is a useful rewriting. These maps are well defined by Lemma B.59, e.g.,

$\sum_{j \in J} |\langle v_{b^{-1}(j)}, \psi \rangle|^2 = \sum_{i \in I} |\langle v_i, \psi \rangle|^2 = \|\psi\|^2$, (B.221)

so that the sum in (B.219) converges, and likewise for (B.220). Furthermore,

$\langle u\psi, u\varphi \rangle = \sum_{i_1, i_2} \langle \psi, v_{i_1} \rangle \langle v_{i_2}, \varphi \rangle \langle v'_{b(i_1)}, v'_{b(i_2)} \rangle = \sum_i \langle \psi, v_i \rangle \langle v_i, \varphi \rangle = \langle \psi, \varphi \rangle$,

where we used (B.207) for the primed basis, and (B.215). Similar computations
establish $\langle v\psi', v\varphi' \rangle = \langle \psi', \varphi' \rangle$, so that (in view of their obvious surjectivity) $u$
and $v$ are both unitary, as well as $uv = 1_{H'}$ and $vu = 1_H$. Thus $H \cong H'$.
Conversely, if $H$ (with basis $\{v_i\}_{i \in I}$) and $H'$ are isomorphic, so that there is a
unitary $u : H \to H'$, then $\{u v_i\}_{i \in I}$ is a basis of $H'$, hence $J$ even equals $I$. □
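
The Gram-Schmidt procedure invoked in part 1 of the proof can be sketched in a few lines of Python for vectors in $\mathbb{C}^n$. This is a minimal illustration under stated assumptions: the function name `gram_schmidt` and the tolerance `tol` are hypothetical, each new vector is normalized immediately, and vectors that are (numerically) linear combinations of earlier ones are skipped rather than removed beforehand.

```python
def inner(x, y):
    """Inner product, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a finite list of vectors, skipping dependent ones."""
    basis = []
    for psi in vectors:
        w = list(psi)
        for v in basis:
            c = inner(v, w)                        # coefficient <v, w>
            w = [wi - c * vi for wi, vi in zip(w, v)]
        n = norm(w)
        if n > tol:                                # w = 0 means psi was dependent
            basis.append([wi / n for wi in w])
    return basis

# The third vector is the sum of the first two and is therefore dropped.
vectors = [[1, 1, 0], [1, 0, 1], [2, 1, 1], [0, 1, 1]]
basis = gram_schmidt(vectors)
gram = [[inner(v, w) for w in basis] for v in basis]
```

The matrix `gram` of mutual inner products is the identity up to rounding, which is condition (B.207) for the resulting set.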

Corollary B.64. If $\{v_i\}_{i \in I}$ is a basis of $H$, then $H \cong \ell^2(I)$.

Proof. Define $u : H \to \ell^2(I)$ by linear extension of $u v_i = \delta_i$, where $i \in I$. □

Corollary B.65. A Hilbert space is (topologically) separable iff it either has a
countable basis, or is finite-dimensional.

Proof. One direction of the proof is the Gram-Schmidt procedure (since the given
countable dense set contains a basis). Conversely, if $\{v_i\}$ is a countable (or finite)
basis of $H$, then the complex rational linear span of this set, i.e., the set of all finite
linear combinations $\sum_i c_i v_i$ with $c_i \in \mathbb{Q} + i\mathbb{Q}$, is countable as well as dense in $H$. □

In particular, any finite-dimensional Hilbert space is isomorphic to $\mathbb{C}^n$ with standard
inner product, and any separable Hilbert space is isomorphic to $\ell^2(\mathbb{N})$; when speaking
of a separable Hilbert space we actually tend to think of the infinite-dimensional
case. Although at first sight separability appears to be a rather restrictive condition,
in fact the non-separable case only appears in some weird proofs in the theory of
operator algebras (as well as in the theory of almost periodic functions in the
sense of H. Bohr). Indeed, every Hilbert space naturally occurring in applications to
mathematical physics (or to partial differential equations) is separable.


B.13 Operators on infinite-dimensional Hilbert space

The fact that all (infinite-dimensional) separable Hilbert spaces are isomorphic suggests
that the riches of the theory are not to be found in the spaces themselves, but
in the operators that act on them (whose explicit form typically depends on some
concrete realization of $H$, like $\ell^2(\mathbb{N})$, or $L^2(\mathbb{R}^n)$, etc.). The simplest operators are
functionals, i.e., linear maps $f : H \to \mathbb{C}$, and the main new feature compared to the
finite-dimensional case is that $f$ is no longer necessarily bounded, see §B.9. The
nature of bounded linear functionals, i.e., elements of the dual $H^*$, is totally settled
by the Riesz-Fréchet Theorem (which we already know; cf. Proposition A.5 and
nos. 6 and 7 in Table B.1 in §B.9), showing that little is gained by looking at them.

Theorem B.66. Let $H$ be a Hilbert space. The map $\psi \mapsto f_\psi$ from $H$ to $H^*$, where

$f_\psi(\varphi) = \langle \psi, \varphi \rangle$, (B.222)

is an isometric anti-linear isomorphism $H \to H^*$.

Proof. For convenience we rewrite (B.124) for the case at hand as

$\|f\| = \sup\{|f(\psi)|, \psi \in H, \|\psi\|_H \leq 1\}$. (B.223)

Since $|f_\psi(\varphi)| = |\langle \psi, \varphi \rangle| \leq \|\psi\| \|\varphi\|$ by Cauchy-Schwarz, it follows that $f_\psi \in H^*$
for any $\psi \in H$, with $\|f_\psi\| \leq \|\psi\|$. We may sharpen this to equality, i.e.,

$\|f_\psi\| = \|\psi\|$, (B.224)

by choosing $f = f_\psi$ and evaluating it on the unit vector $\psi / \|\psi\|$ in (B.223) (for $\psi \neq 0$).
Hence $\psi \mapsto f_\psi$ is isometric and therefore also injective. To prove surjectivity, we
find a vector $\psi$ for which some given functional $f$ equals $f_\psi$. Assume $f \neq 0$
(otherwise, $\psi = 0$ does the job). Then $\ker(f)^\perp \neq \{0\}$: namely, $\ker(f)$ is closed
by continuity of $f$ and is linear by linearity of $f$, whence $\ker(f)^{\perp\perp} = \ker(f)$ by
Proposition B.57.3, so that (arguing by contradiction) $\ker(f)^\perp = \{0\}$ would imply
$\ker(f)^{\perp\perp} = H$ and hence $\ker(f) = H$, or $f = 0$.

The remainder of the proof is the same as for Proposition A.5. □

This allows one to make the weak topology on $H$ (or, equivalently, the weak*
topology on $H^*$) explicit (cf. §B.9): we have $\psi_n \to \psi$ weakly iff $\langle \varphi, \psi_n - \psi \rangle \to 0$
for each $\varphi \in H$ (and similarly for nets). From the general theory, or directly from
Cauchy-Schwarz, it is immediate that (at least for infinite-dimensional $H$) the weak
topology on $H$ is indeed weaker than the strong one (that is, strong convergence implies
weak convergence), but not the other way round. A simple example is provided
by any ordered countable basis $(v_n)_{n \in \mathbb{N}}$ of a separable Hilbert space, where $v_n \to 0$
weakly but not strongly (more generally, for any infinite-dimensional
Hilbert space and any basis $\{v_i\}$ we have $v_i \to 0$ weakly but not strongly in the
sense of convergence of nets). Nonetheless, as a corollary of Proposition B.46:

Corollary B.67. The functional $f_\psi$ defined by (B.222) is weakly continuous.
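
The example of a basis converging weakly but not strongly to zero can be made concrete for $\ell^2(\mathbb{N})$ with its standard basis. In the Python sketch below, finitely supported vectors are stored as dictionaries from indices to coefficients; the test vector $\varphi = (1, 1/2, 1/3, \ldots)$ and all function names are illustrative assumptions. Testing against one fixed $\varphi$ only illustrates the definition of weak convergence, it does not prove it.

```python
def phi(k):
    """k-th coordinate of phi = (1, 1/2, 1/3, ...), a vector in l^2(N)."""
    return 1.0 / k

def basis_vector(n):
    """Standard basis vector v_n, stored sparsely as {index: coefficient}."""
    return {n: 1.0}

def pair_with_phi(psi):
    """<phi, psi> for a finitely supported psi (phi is real here)."""
    return sum(phi(k) * c for k, c in psi.items())

def norm(psi):
    return sum(abs(c) ** 2 for c in psi.values()) ** 0.5

indices = (1, 10, 100, 1000)
pairings = [pair_with_phi(basis_vector(n)) for n in indices]   # equals 1/n
norms = [norm(basis_vector(n)) for n in indices]               # always 1
```

The pairings decrease to zero while the norms stay equal to one, so the sequence cannot converge to zero in norm.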


We now move from functionals as special operators from $H$ to $\mathbb{C}$ to operators in
the usual sense, i.e., linear maps from $H$ to itself. Once again, the main new feature
compared to the finite-dimensional case is that a linear map $a : H \to H$ is no longer
necessarily bounded, where (cf. Definition B.32) we recall that $a$ is bounded if it
satisfies one (and hence both) of the following equivalent conditions (for some constant $C > 0$):

$\|a\psi\| \leq C \|\psi\| \quad (\psi \in H)$; (B.225)

$\sup\{\|a\psi\|, \psi \in H, \|\psi\| \leq 1\} < \infty$. (B.226)

In that case, the (finite) supremum is called the norm $\|a\|$ of $a$, exactly as in (A.18).
Using Theorem B.66 and (B.130), we therefore have

$\|a\| = \sup\{\|a\psi\|, \psi \in H, \|\psi\| = 1\}$ (B.227)

$= \sup\{|\langle \varphi, a\psi \rangle|, \psi, \varphi \in H, \|\psi\| = \|\varphi\| = 1\}$, (B.228)

and we have the inequalities (A.20) and (A.21), as in the finite-dimensional case.

It is clear from (A.20) and (B.225) that bounded operators $a$ are continuous,
in that if $\psi_n \to \psi$, then $a\psi_n \to a\psi$. On the other hand, unbounded operators are
discontinuous in this sense: for each $n \in \mathbb{N}$ there is $\psi_n \in H$ with $\|\psi_n\| = 1$ and
$\|a\psi_n\| \geq n$. The sequence $(\psi'_n = \psi_n / n)$ then converges to zero, but since $\|a\psi'_n\| \geq 1$,
the sequence $(a\psi'_n)$ does not converge to $a \cdot 0 = 0$. Thus on infinite-dimensional
Hilbert spaces a sharp distinction emerges between bounded and unbounded operators.
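
The discontinuity argument can be illustrated with a standard unbounded operator, namely $(a\psi)_k = k\,\psi_k$ acting on sequences. In the Python sketch below this operator is defined only on finitely supported sequences (a dense subspace of $\ell^2(\mathbb{N})$, not all of $H$), which is an assumption made for the sake of a computable example; the function names are likewise hypothetical.

```python
def a(psi):
    """(a psi)_k = k * psi_k on finitely supported sequences {k: psi_k}."""
    return {k: k * c for k, c in psi.items()}

def norm(psi):
    return sum(abs(c) ** 2 for c in psi.values()) ** 0.5

def scaled_basis_vector(n):
    """psi'_n = v_n / n, where v_n is the n-th standard basis vector."""
    return {n: 1.0 / n}

indices = (1, 10, 100, 1000)
input_norms = [norm(scaled_basis_vector(n)) for n in indices]       # 1/n
output_norms = [norm(a(scaled_basis_vector(n))) for n in indices]   # stays at 1
```

The inputs tend to zero in norm while their images all have norm one, which is exactly the failure of continuity at zero described above.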

Among the former, we will distinguish between compact operators and the rest,
whilst among the latter, one has the closed operators (i.e., those with a closed
graph), which are still reasonably well-behaved, and the (non-closed) rest. Yet cutting
through the bounded-unbounded divide is the notion of self-adjointness. For
any linear (not necessarily bounded) map $a : H \to H$, we say that $a$ is self-adjoint if

$\langle a\varphi, \psi \rangle = \langle \varphi, a\psi \rangle \quad (\psi, \varphi \in H)$. (B.229)

The remarkable Hellinger-Toeplitz Theorem then states that such maps are bounded:

Theorem B.68. If a linear map $a : H \to H$ satisfies (B.229), then it is bounded.

Proof. The proof is based on the Closed Graph Theorem B.37. If the sequence
$(\psi_n, a\psi_n)$ in $G(a) \subset H \oplus H$ converges, say to $(\psi, \varphi) \in H \oplus H$, then $\psi_n \to \psi$ and
$a\psi_n \to \varphi$. Using (B.229) and continuity of the inner product, for $\chi \in H$ we have

$\langle \chi, \varphi \rangle = \lim_n \langle \chi, a\psi_n \rangle = \lim_n \langle a\chi, \psi_n \rangle = \langle a\chi, \psi \rangle = \langle \chi, a\psi \rangle$.

For $\chi = \varphi - a\psi$, this yields $\varphi = a\psi$, and hence $(\psi, \varphi) \in G(a)$. This means that
$G(a)$ is closed, upon which the Closed Graph Theorem states that $a$ is bounded. □

More generally, if $V$ and $W$ are Banach spaces, with dual spaces $V^*$ and $W^*$, respectively,
and two linear (but not a priori bounded) maps $a : V \to W$ and $b : W^* \to V^*$
satisfy $\varphi(av) = (b\varphi)(v)$ for each $v \in V$ and $\varphi \in W^*$, then $a$ and $b$ are bounded, with
$b = a^*$, as defined in (B.125). The proof is similar.

This generalization of Theorem B.68 also places the familiar adjoint $a^*$ from
Hilbert space in broader perspective: making the identification $f_\psi \cong \psi$ of $H^*$ with $H$
described by the Riesz-Fréchet Theorem B.66, the Banach space definition (B.125)
of the adjoint $a^* : H^* \to H^*$ of a bounded linear map $a : H \to H$ reproduces the
definition (A.15) of the Hilbert space adjoint $a^* : H \to H$. Thus we also infer that
(B.128) is valid for arbitrary Hilbert spaces. Note that in the Hilbert space case,
boundedness of $a^*$ may be proved more simply, as follows.

Proposition B.69. Let $a \in B(H)$ and let $a^* : H \to H$ be its adjoint, that is,

$\langle a^*\psi, \varphi \rangle = \langle \psi, a\varphi \rangle \quad (\psi, \varphi \in H)$. (B.230)

Then $a^*$ is bounded, with $\|a^*\| = \|a\|$.

Proof. Eq. (B.230) gives $|\langle a^*\psi, \varphi \rangle| \leq \|a\| \|\psi\| \|\varphi\|$. Taking $\varphi = a^*\psi$ yields $\|a^*\psi\| \leq
\|a\| \|\psi\|$, and hence $\|a^*\| \leq \|a\|$. Replacing $a$ by $a^*$ (and using $a^{**} = a$) gives the last claim. □

Since unbounded self-adjoint operators $a : H \to H$ do not exist, von Neumann
defined such operators on some (proper) linear subspace $D(a) \subset H$ (always assumed
to be dense in $H$), called the domain of $a$. This affects the definition of the adjoint:

Definition B.70. 1. The adjoint $a^*$ of an operator $a : D(a) \to H$ has domain $D(a^*) \subset H$
consisting of all $\psi \in H$ for which the functional $f^a_\psi : D(a) \to \mathbb{C}$, defined by

$f^a_\psi(\varphi) = \langle \psi, a\varphi \rangle \quad (\varphi \in D(a))$, (B.231)

is bounded, i.e., there is $C > 0$ such that $|f^a_\psi(\varphi)| \leq C \|\varphi\|$ for all $\varphi \in D(a)$.

2. For $\psi \in D(a^*)$, the functional $f^a_\psi$ has a unique bounded extension $f^a_\psi : H \to \mathbb{C}$,
so by Theorem B.66 there is a unique vector $\psi' \in H$ such that $f^a_\psi(\varphi) = \langle \psi', \varphi \rangle$.

3. The adjoint $a^* : D(a^*) \to H$, then, is defined by $a^*\psi = \psi'$, or, equivalently, by

$\langle a^*\psi, \varphi \rangle = \langle \psi, a\varphi \rangle, \quad \psi \in D(a^*), \varphi \in D(a)$. (B.232)

Note that, on our assumption that $D(a)$ be dense in $H$, i.e., $D(a)^- = H$, eq.
(B.232) indeed uniquely specifies $a^*\psi$ because of Proposition B.57.4.

4. An operator $a : D(a) \to H$ is called self-adjoint when $D(a^*) = D(a)$ and $a^* = a$.

If $D(a) = H$, and $a$ is bounded, then also $D(a^*) = H$, since $|f^a_\psi(\varphi)| \leq \|a\| \|\psi\| \|\varphi\|$,
so that $f^a_\psi$ is bounded for any $\psi \in H$. Accordingly, for $a \in B(H)$, Definition B.70
reduces to the usual definition (A.15). Furthermore, even if $D(a)$ is merely dense
in $H$, if $a : D(a) \to H$ is bounded in the sense of (B.225) - (B.226), but now with
$\psi \in D(a)$ instead of $\psi \in H$, then $a$ has a unique extension to a bounded operator
$a : H \to H$, whose adjoint $a^*$ may be either defined through Definition B.70 as the
adjoint of $a : D(a) \to H$, or, equivalently, as the adjoint of the extension $a : H \to H$.
Here, as well as in Definition B.70.2, a general Banach space principle is at work:

Proposition B.71. Let $V$ and $W$ be Banach spaces, and let $V'$ be a dense linear subspace
of $V$. Any bounded linear map $a' : V' \to W$ (in the sense of Definition B.32) has a
unique bounded linear extension $a : V \to W$, with $\|a\| = \|a'\|$.

Proof. For $v \in V$ there is a sequence $(v_n)$ in $V'$ with $v_n \to v$. Since $a' : V' \to W$ is
bounded and $(v_n)$ is convergent in $V$ and hence Cauchy in $V'$, also the sequence
$(a'v_n)$ in $W$ is Cauchy. Since $W$ is assumed complete, we may define $av = \lim_n a'v_n$.
This limit is easily seen to be independent of the approximating sequence to $v$, and
the ensuing map $a : V \to W$ is clearly linear. Furthermore, since by (B.5) we have
$\|v\| = \lim_n \|v_n\|$, if we assume $\|v\| = 1$ we can take $v_n$ to have unit norm also.
Once again from (B.5), we also have $\|av\| = \lim_n \|a'v_n\| \leq \sup_n \|a'v_n\|$, whence
$\|a\| \leq \|a'\|$. But for $v \in V'$, taking $v_n = v$ we have $a'v = av$, and hence the bound
$\|a'v\| \leq \|a\| \|v\|$, from which $\|a'\| \leq \|a\|$, so that finally $\|a\| = \|a'\|$. □

To complete these basic definitions, we say that an (unbounded) operator $a :
D(a) \to H$ is closed if its graph $G(a) = \{(\psi, a\psi), \psi \in D(a)\}$ is a closed subspace
of $H \oplus H$, cf. (B.108). Note that in the Hilbert space case it is more appropriate to
replace the norm (B.107) on $H \oplus H$ by the equivalent norm

$\|(v, w)\| = \sqrt{\|v\|^2 + \|w\|^2}$, (B.233)

since this alternative norm comes from the canonical inner product on $H \oplus H$, viz.

$\langle (v, w), (v', w') \rangle_{H \oplus H} = \langle v, v' \rangle_H + \langle w, w' \rangle_H$. (B.234)

We now prove an important property of self-adjoint operators: 

Proposition B.72. The adjoint $a^*$ of any operator $a : D(a) \to H$ is closed. In particular,
self-adjoint operators are closed.

Proof. The proof can be elegantly given in terms of the graph $G(a)$. Defining

$u : H \oplus H \to H \oplus H$; (B.235)

$u(\psi_1, \psi_2) = (-\psi_2, \psi_1)$, (B.236)

it is easy to verify that $u$ is a unitary operator, and that

$G(a^*) = u(G(a)^\perp) = (uG(a))^\perp$. (B.237)

Hence $G(a^*)$ is closed by Proposition B.57.1, and the claim follows. □
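
In finite dimension, where every operator is bounded and the adjoint is the conjugate transpose, one half of (B.237) is easy to test numerically: each element of $G(a^*)$ is orthogonal to $u$ applied to each element of $G(a)$. The Python sketch below uses an arbitrarily chosen $2 \times 2$ matrix; the matrix, the vectors and all function names are illustrative assumptions.

```python
def inner(x, y):
    """Inner product on C^n, linear in the second argument."""
    return sum(p.conjugate() * q for p, q in zip(x, y))

def apply(m, x):
    """Matrix-vector product."""
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def adjoint(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def graph_inner(p, q):
    """Canonical inner product (B.234) on the direct sum H + H."""
    return inner(p[0], q[0]) + inner(p[1], q[1])

def u(pair):
    """The unitary (psi_1, psi_2) -> (-psi_2, psi_1) of (B.236)."""
    psi1, psi2 = pair
    return ([-c for c in psi2], list(psi1))

a = [[1 + 2j, 3j], [0.5 + 0j, -1 + 1j]]
psi = [1 - 1j, 2 + 0.5j]
phi = [0.3j, -2 + 1j]

point_in_adjoint_graph = (psi, apply(adjoint(a), psi))   # (psi, a* psi)
rotated_graph_point = u((phi, apply(a, phi)))            # u(phi, a phi)
pairing = graph_inner(point_in_adjoint_graph, rotated_graph_point)
```

The value of `pairing` is $-\langle \psi, a\varphi \rangle + \langle a^*\psi, \varphi \rangle$, which vanishes up to rounding by the defining property of the adjoint.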


In the context of spectral theory, we will see later what the real importance of
self-adjointness (and, more generally, closedness) is. It is time for some examples.

Proposition B.73. Let $H = \ell^2(X)$, with $X$ countable for simplicity, and for $f \in
\ell^\infty(X)$ define the multiplication operator $m_f : H \to H$ by

$m_f \psi = f\psi$, (B.238)

i.e., $m_f\psi(x) = f(x)\psi(x)$. Then $m_f$ is bounded, with norm, cf. (A.107),

$\|m_f\| = \|f\|_\infty$. (B.239)

More generally, let $H = L^2(X)$ for some $\sigma$-finite Borel space $(X, \Sigma, \mu)$, and for
$f \in L^\infty(X)$, define $m_f$ in the same way. Then $m_f$ is again bounded, with norm

$\|m_f\| = \|f\|_\infty^{\mathrm{ess}}$. (B.240)

Finally, let $f : X \to \mathbb{C}$ be measurable (but not necessarily essentially bounded). Then

$D(m_f) = \{\psi \in L^2(X) \mid f\psi \in L^2(X)\}$ (B.241)

is dense in $L^2(X)$, and if $f^* = f$, the operator $m_f : D(m_f) \to L^2(X)$ is self-adjoint.

Proof. On $\ell^2(X)$ we have $\|f\psi\|_2 \leq \|f\|_\infty \|\psi\|_2$ and hence $\|m_f\| \leq \|f\|_\infty$. Assume
$f \neq 0$. Then $\|f\|_\infty > 0$, and for any $0 < t < \|f\|_\infty$ there is $x_t \in X$ such that $|f(x_t)| > t$,
so that $\psi_t = \delta_{x_t} \in \ell^2(X)$ satisfies $\|m_f \psi_t\|_2 = |f(x_t)| > t$, whence $\|m_f\| > t$. This
holds for all $0 < t < \|f\|_\infty$, hence $\|m_f\| \geq \|f\|_\infty$, which yields (B.239).

To prove (B.240), again assume $\|f\|_\infty^{\mathrm{ess}} > 0$ and $0 < t < \|f\|_\infty^{\mathrm{ess}}$. Then the set
$X_t = \{x \in X, |f(x)| > t\}$ is measurable, with $\mu(X_t) > 0$. Since $(X, \Sigma, \mu)$ is $\sigma$-finite,
there is $X'_t \subset X_t$ with $0 < \mu(X'_t) < \infty$. Take $\psi = 1_{X'_t}$, so that $\|f\psi\|_2 \geq t \|\psi\|_2$, etc.

To prove the density of $D(m_f)$, for $n \in \mathbb{N}$ define $X_n = \{x \in X \mid |f(x)| \leq n\}$, so that
$X = \bigcup_n X_n$. For each $\psi \in L^2(X)$ we then have $1_{X_n}\psi \in D(m_f)$. Writing $\varphi_n = 1_{X_n}\psi$,
we have $\langle \psi, \varphi_n \rangle = \int_{X_n} d\mu\, |\psi|^2$, hence $\langle \psi, \varphi_n \rangle = 0$ iff $\psi = 0$ $\mu$-a.e. on $X_n$. This is true
for each $n \in \mathbb{N}$ iff $\psi = 0$, so the required density follows from Proposition B.57.4.

In the last claim (where $f^*(x) = \overline{f(x)}$), the domain $D(m_f^*)$ consists of all $\psi \in
L^2(X)$ for which the map $\varphi \mapsto \int_X d\mu\, \overline{\psi} f \varphi$ is bounded; by Theorem B.66 this is the
case iff $f\psi \in L^2(X)$, so that $D(m_f^*) = D(m_f)$. Moreover, (B.232) obviously holds
for $a^* = m_f$ (if $f$ takes complex values, then $m_f^* = m_{f^*}$, still on $D(m_f^*) = D(m_f)$). □
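
For a finite set $X$ the norm formula (B.239) can be checked directly. The Python sketch below picks an arbitrary function $f$ on a four-point set (an illustrative assumption, as are all names), verifies that the ratio $\|m_f\psi\|_2 / \|\psi\|_2$ never exceeds $\|f\|_\infty$ on a few sample vectors, and shows that the bound is attained on the delta function at a point where $|f|$ is maximal, as in the proof.

```python
def multiply(f, psi):
    """Pointwise product (m_f psi)(x) = f(x) psi(x)."""
    return [fx * px for fx, px in zip(f, psi)]

def norm2(psi):
    """The l^2 norm on a finite set X."""
    return sum(abs(c) ** 2 for c in psi) ** 0.5

f = [1 + 0j, -3 + 0j, 2j, 0.5 + 0j]
sup_norm = max(abs(fx) for fx in f)

# Delta function at a point where |f| attains its maximum.
x_max = max(range(len(f)), key=lambda x: abs(f[x]))
delta = [1.0 if x == x_max else 0.0 for x in range(len(f))]
attained = norm2(multiply(f, delta))

samples = [[1, 1, 1, 1], [1j, 2, -1, 0.5], [0, 0, 3, -4j]]
ratios = [norm2(multiply(f, psi)) / norm2(psi) for psi in samples]
```

Every entry of `ratios` is bounded by `sup_norm`, and `attained` equals `sup_norm`, so the supremum in (B.227) is reached in this finite case.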

For quantum mechanics, a key example is $H = L^2(\mathbb{R})$ with $f(x) = x$, i.e., the position
operator. It then follows from Proposition B.73 that $x$ is self-adjoint on the domain

$D(m_x) = \{\psi \in L^2(\mathbb{R}) \mid \int_{\mathbb{R}} dx\, x^2|\psi(x)|^2 < \infty\}$. (B.242)

See also §5.11. It happens often that a given operator on some domain is not closed
as it stands, but can be made so by slightly enlarging its domain. Thus an operator
$a : D(a) \to H$ is closable if the closure of the graph $G(a)$ in $H \oplus H$ is the graph of
a closed operator $a^-$, called the closure of $a$, i.e., $G(a)^- = G(a^-)$. The following
easy lemma is very useful in proving closability (the proof is a definition chase).

Lemma B.74. Each of the following conditions is equivalent to closability of $a$:

1. If $(\psi_n)$ is a sequence in $D(a)$ such that $\psi_n \to 0$, and if its image $(a\psi_n)$ converges,
too, then $a\psi_n \to 0$.

2. The domain $D(a^*)$ of the adjoint $a^*$ (see Definition B.70) is dense in $H$.

The domain $D(a^-)$ of the closure of a closable operator $a$ consists of all $\psi \in H$
for which there exists a sequence $(\psi_n)$ in $D(a)$ such that $\psi_n \to \psi$ and $a\psi_n$ converges,
so that $a^-\psi = \lim_n a\psi_n$. Finally, if $a$ is closable, then $a^- = a^{**}$ and $(a^-)^* = a^*$.

An equality $a = b$ between unbounded operators always stands for $D(a) = D(b)$
and $a\psi = b\psi$ for all $\psi$ in this common domain. Furthermore, $a \subset b$ means
$D(a) \subset D(b)$ and $b = a$ on $D(a)$.

Definition B.75. Let $a : D(a) \to H$ (where $D(a)$ is dense) be an operator.

• If $a \subset a^*$, i.e., if $\langle a\varphi,\psi\rangle = \langle\varphi,a\psi\rangle$ for all $\varphi,\psi \in D(a)$, then $a$ is called symmetric.

• If $a$ is closable and $a^- = a^*$ (in which case the closure $a^-$ of $a$ is self-adjoint),
then $a$ is called essentially self-adjoint.

It follows from Lemma B.74 that a symmetric operator is closable (because $D(a^*)$,
containing $D(a)$, is dense). For a symmetric operator one has $a \subset a^- = a^{**} \subset a^*$,
with equality at the first position when $a$ is closed, and equality at the second position
when $a$ is essentially self-adjoint; when both equalities hold, $a$ is self-adjoint.
Conversely, an essentially self-adjoint operator is symmetric. A symmetric operator
may or may not be essentially self-adjoint; we will not discuss this problem here.

As in the finite-dimensional case, the notion of the adjoint allows one to define a
projection as an operator $e : H \to H$ that satisfies $e^2 = e^* = e$. However, Proposition
A.8 should be slightly adapted in order to cover the infinite-dimensional case:

Proposition B.76. There is a bijective correspondence $e \leftrightarrow L$ between:

• projections $e$ on $H$;

• closed linear subspaces $L$ of $H$,

still given by (A.27) - (A.28), where now $(\upsilon_i)_{i\in I}$ is a basis of $L$, and the latter sum
must be applied to fixed $\psi \in H$ according to Definition B.6 with $V = H$, i.e.,

$e\psi = \sum_{i\in I}\langle\upsilon_i,\psi\rangle\upsilon_i$. (B.243)

Alternatively, without invoking the concept of a basis, one may use the decomposition
(B.203) as proved via Lemma B.58, to define $e$ directly by $e\psi = \psi_\parallel$.

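Proof. The linear subspace $L = eH$ is closed, since $e$ is bounded by Theorem B.68.

Conversely, note that since $L$ is closed, it is a Hilbert space, so that it has a basis
by Theorem B.63. The sum in (B.243) then converges by Lemma B.59, and since

$\langle\varphi,e\psi\rangle = \sum_{i\in I}\langle\upsilon_i,\psi\rangle\langle\varphi,\upsilon_i\rangle = \overline{\sum_{i\in I}\langle\upsilon_i,\varphi\rangle\langle\psi,\upsilon_i\rangle} = \overline{\langle\psi,e\varphi\rangle} = \langle e\varphi,\psi\rangle$;

$e^2\psi = \sum_{i\in I}\langle\upsilon_i,\psi\rangle e\upsilon_i = \sum_{i,j\in I}\langle\upsilon_i,\psi\rangle\langle\upsilon_j,\upsilon_i\rangle\upsilon_j = \sum_{i\in I}\langle\upsilon_i,\psi\rangle\upsilon_i = e\psi$,

the operator $e$ is a projection (in the second computation we used boundedness of $e$
to pull it through the sum). Next, (B.243) is independent of the choice of a basis of
$L$, since if $(\upsilon'_{i'})_{i'\in I'}$ is another basis of $L$, for arbitrary $\varphi \in L$ we may compute:

$\langle\varphi, \sum_{i\in I}\langle\upsilon_i,\psi\rangle\upsilon_i - \sum_{i'\in I'}\langle\upsilon'_{i'},\psi\rangle\upsilon'_{i'}\rangle = \sum_{i\in I}\langle\varphi,\upsilon_i\rangle\langle\upsilon_i,\psi\rangle - \sum_{i'\in I'}\langle\varphi,\upsilon'_{i'}\rangle\langle\upsilon'_{i'},\psi\rangle$

$= \langle\varphi,\psi\rangle - \langle\varphi,\psi\rangle = 0$, (B.244)

where we twice used (B.215), applied to the Hilbert space $L$. Hence

$\sum_{i\in I}\langle\varphi,\upsilon_i\rangle\langle\upsilon_i,\psi\rangle = \sum_{i'\in I'}\langle\varphi,\upsilon'_{i'}\rangle\langle\upsilon'_{i'},\psi\rangle$. (B.245)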

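Finally, we prove bijectivity of the correspondence $e \leftrightarrow L$:

• Given $L$, by Lemma B.59 (applied to the Hilbert space $L$), $e\psi \in L$ for any $\psi \in H$,
whereas if $\psi \in L$, then $e\psi = \psi$ by Theorem B.61 and (B.210). Hence $eH = L$.

• Given $e$, we first note that for any $\chi \in eH \equiv L$, by definition we have $\chi = e\psi$
for some $\psi \in H$, whence $e\chi = e^2\psi = e\psi = \chi$. Now pick a basis $(\upsilon_i)$ of the
Hilbert space $eH$, so that in particular $e\upsilon_i = \upsilon_i$. For arbitrary $\varphi,\psi \in H$, writing
$\varphi = e\varphi + (1 - e)\varphi \equiv \varphi_\parallel + \varphi_\perp$, so that $\varphi_\parallel \in L$ and hence $e\varphi_\parallel = \varphi_\parallel$, we compute

$\langle\varphi_\parallel,e\psi\rangle - \sum_i\langle\varphi_\parallel,\upsilon_i\rangle\langle\upsilon_i,\psi\rangle = \langle\varphi_\parallel,\psi\rangle - \langle\varphi_\parallel,\psi\rangle = 0$;

$\langle\varphi_\perp,e\psi\rangle - \sum_i\langle\varphi_\perp,\upsilon_i\rangle\langle\upsilon_i,\psi\rangle = \langle\varphi,(1 - e)e\psi\rangle - \sum_i\langle\varphi,(1 - e)\upsilon_i\rangle\langle\upsilon_i,\psi\rangle = 0$,

where in the first line we used (B.215), applied to the Hilbert space $H$. □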
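The content of Proposition B.76 can be checked by hand in a finite-dimensional toy case. The following is a minimal sketch in Python, assuming the real Hilbert space $\mathbb{R}^3$ and the plane $L = \{x_3 = 0\}$; the helper names `inner` and `project` are illustrative choices and do not appear in the text. It applies formula (B.243) with two different orthonormal bases of $L$ and compares the results.

```python
import math

def inner(u, v):
    """Inner product on R^n (real case, so no complex conjugation is needed)."""
    return sum(x * y for x, y in zip(u, v))

def project(basis, psi):
    """Apply e psi = sum_i <v_i, psi> v_i, cf. (B.243), for an orthonormal family `basis`."""
    out = [0.0] * len(psi)
    for v in basis:
        c = inner(v, psi)
        out = [o + c * x for o, x in zip(out, v)]
    return out

# Two orthonormal bases of the same plane L = {x_3 = 0} in R^3.
s = 1 / math.sqrt(2)
basis_1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
basis_2 = [[s, s, 0.0], [s, -s, 0.0]]

psi = [2.0, -1.0, 5.0]
e_psi_1 = project(basis_1, psi)   # projection computed with the first basis
e_psi_2 = project(basis_2, psi)   # projection computed with the second basis
print(e_psi_1, e_psi_2)
```

Up to rounding, both computations give the same vector, in line with the basis independence expressed by (B.244) - (B.245); applying `project` a second time leaves the result unchanged, as $e^2 = e$ requires.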

It is easy to see why the sum (B.243) cannot, in general, converge in norm without
the $\psi$, i.e., in the original (finite-dimensional) form (A.28); it suffices to take $e = 1$
(for $H = \ell^2(\mathbb{N})$, for simplicity). Writing $e_n = \sum_{i=1}^n |\upsilon_i\rangle\langle\upsilon_i|$, where, for example,
$\upsilon_i = \delta_i$, for any unit vector $\psi$ and $m > n$, from (A.18) we have

$\|e_m - e_n\|^2 \geq \|(e_m - e_n)\psi\|^2 = \sum_{i=n+1}^m |\langle\upsilon_i,\psi\rangle|^2$. (B.246)

Taking $\psi = \delta_j$ for $n + 1 \leq j \leq m$ shows that $\|e_m - e_n\| \geq 1$ for all $m > n$,
so that $(e_n)$ cannot be a Cauchy sequence in $B(H)$. This argument applies to any
infinite-dimensional subspace $L$. Therefore, if $H$ is infinite-dimensional we should
work with at least two notions of convergence within the Banach space $B(H)$ (cf.
Theorem B.33), which for simplicity we state for sequences (more generally, one
should define the corresponding topologies in terms of convergence of nets):

• $a_n \to a$ in the norm topology (or uniformly) in $B(H)$ iff $\|a_n - a\| \to 0$.

• $a_n \to a$ in the strong topology in $B(H)$ iff $\|(a_n - a)\psi\| \to 0$, for each $\psi \in H$.

The strong topology on $B(H)$ is also called the strong operator topology, in order
to distinguish it from the strong topology on $H$ itself (which, confusingly, is another
name for the norm topology) in terms of which it is defined. Similarly, the weak
topology on $H$ (cf. §B.12) defines a weak operator topology on $B(H)$, as follows:

• $a_n \to a$ weakly on $B(H)$ iff $\langle\varphi,(a_n - a)\psi\rangle \to 0$, for each $\varphi,\psi \in H$.

In decreasing strength we have ‘norm - strong - weak’, and we show that this trio is
distinguishable on $H = \ell^2(\mathbb{N})$ (and hence on any infinite-dimensional Hilbert space):

• Let $a_n\psi(x) = 0$ for $x = 1,\ldots,n$, whilst $a_n\psi(x) = \psi(x)$ for $x > n$. In other words,
if $\psi = (\psi_1,\psi_2,\ldots)$, then $a_n\psi = (0,\ldots,0,\psi_{n+1},\psi_{n+2},\ldots)$ with $n$ zeros. Hence

$\|a_n\psi\|^2 = \sum_{k=n+1}^\infty |\psi_k|^2$,

so that $\|a_n\psi\| \to 0$ as $n \to \infty$ in order for $\psi$ to be in $\ell^2(\mathbb{N})$. Thus $a_n \to 0$ strongly
(and hence also weakly). If $(a_n)$ were to have a norm limit, it therefore would
have to be zero, too, but since $\|a_n\| \geq \|a_n\psi\|$ for any unit vector $\psi$, taking e.g.
$\psi = \delta_{n+1}$, we have $\|a_n\| \geq 1$ for any $n$ and hence $(a_n)$ cannot converge in norm.

• A slight variation on this example is $a_n\psi(x) = 0$ for $x = 1,\ldots,n$ (once again), but
now $a_n\psi(x) = \psi(x - n)$ for $x > n$, or, equivalently, $a_n\psi = (0,\ldots,0,\psi_1,\psi_2,\ldots)$
with $n$ zeros. This time, we have $\|a_n\psi\| = \|\psi\|$, so to begin with, $a_n \to 0$ strongly
is excluded. However, $\langle\varphi,a_n\psi\rangle = \sum_{x=1}^\infty \overline{\varphi(x+n)}\psi(x)$, so $\lim_{n\to\infty}\langle\varphi,a_n\psi\rangle = 0$:
to see this, take $N < \infty$ fixed and use Cauchy-Schwarz to estimate

$|\langle\varphi,a_n\psi\rangle| \leq \left|\sum_{x=1}^N \overline{\varphi(x+n)}\psi(x)\right| + \left|\sum_{x=N+1}^\infty \overline{\varphi(x+n)}\psi(x)\right|$

$\leq \|\psi\|\left(\sum_{x=n+1}^\infty |\varphi(x)|^2\right)^{1/2} + \|\varphi\|\left(\sum_{x=N+1}^\infty |\psi(x)|^2\right)^{1/2}$. (B.247)

Letting $n \to \infty$ and then $N \to \infty$ yields $\langle\varphi,a_n\psi\rangle \to 0$, so that $a_n \to 0$ weakly.
But $(a_n)$ has no strong limit (for if it existed, it would have to be zero, too).
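The two examples above can be explored numerically. The following Python sketch is an illustration under stated assumptions: vectors in $\ell^2(\mathbb{N})$ are replaced by finitely supported real lists (entries beyond the end of a list count as zero), and the helper names `tail_cut`, `shift`, `norm` and `inner` are hypothetical, not taken from the text.

```python
import math

def tail_cut(psi, n):
    """First example: (a_n psi)(x) = 0 for x <= n and psi(x) for x > n."""
    return [0.0] * min(n, len(psi)) + list(psi[n:])

def shift(psi, n):
    """Second example: a_n psi = (0, ..., 0, psi_1, psi_2, ...) with n zeros."""
    return [0.0] * n + list(psi)

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def inner(phi, psi):
    # Real entries; zip stops at the shorter list, so missing entries count as 0.
    return sum(p * q for p, q in zip(phi, psi))

M = 5000
psi = [1.0 / k for k in range(1, M + 1)]   # finitely supported stand-in for (1/k)
phi = list(psi)

for n in (10, 100, 1000):
    # columns: n, ||a_n psi|| (tail cut), ||a_n psi|| (shift), <phi, a_n psi> (shift)
    print(n, norm(tail_cut(psi, n)), norm(shift(psi, n)), inner(phi, shift(psi, n)))
```

In this sketch the norms in the first example shrink with $n$ (strong convergence to zero), whereas in the second example the norm stays equal to $\|\psi\|$ while the inner products shrink (weak but not strong convergence). A unit vector such as $\delta_{n+1}$ is left untouched by the first $a_n$, which is why no norm convergence is possible.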

It is clear from Theorem B.33 that B{H) is sequentially complete in its norm 
topology. This is true also in the weak and strong operator topologies: 

Proposition B.77. Let $(a_n)$ be a sequence in $B(H)$.

1. If $(a_n\psi)$ converges in $H$ for each $\psi \in H$, then the operator $a : H \to H$ defined by
$a\psi = \lim_n a_n\psi$ is bounded (and hence $a_n \to a$ strongly, where $a \in B(H)$).

2. If $(\langle\varphi,a_n\psi\rangle)$ converges in $\mathbb{C}$ for each $\varphi,\psi \in H$, then there is an operator
$a \in B(H)$ such that $a_n \to a$ weakly.

It is instructive to prove this, using two results of independent interest. 

Theorem B.78. Suppose $V$ is a Banach space, $W$ is a normed space (not necessarily
complete), $X$ is an arbitrary set, and $(a_x)_{x\in X}$ is some family of operators in $B(V,W)$
indexed by $X$. If the family is pointwise bounded in that

$\sup\{\|a_x v\|, x \in X\} < \infty$ $(v \in V)$, (B.248)

then the family is uniformly bounded in that

$\sup\{\|a_x\|, x \in X\} < \infty$. (B.249)

This is the Principle of Uniform Boundedness or Banach-Steinhaus Theorem.

Proof. If $W$ is not complete, use its completion in what follows. Define $\ell^\infty(X,W)$
to be the set of all bounded functions $f : X \to W$, i.e., those functions such that
$\sup\{\|f(x)\|, x \in X\} < \infty$, with pointwise operations. This is easily checked to be
a Banach space itself in the natural norm $\|f\|_\infty = \sup\{\|f(x)\|, x \in X\}$ (using the
auxiliary functions $\tilde{f} : X \to \mathbb{C}$ defined by each $f \in \ell^\infty(X,W)$ as $\tilde{f}(x) = \|f(x)\|$, so
that $\|\tilde{f}\|_\infty = \|f\|_\infty$, one may largely reduce the proof to the ordinary $\ell^\infty(X)$ case).

For fixed $v \in V$, define $f_v : X \to W$ by $f_v(x) = a_x(v)$. By assumption, $f_v \in
\ell^\infty(X,W)$, so we may define an operator $F : V \to \ell^\infty(X,W)$ by $F(v) = f_v$. We now
show that the graph $G(F)$ is closed: if $v_n \to v$ in $V$ and $Fv_n \to g$ in $\ell^\infty(X,W)$, then
since uniform convergence implies pointwise convergence, for each $x \in X$ we have

$g(x) = \lim_n (Fv_n)(x) = \lim_n f_{v_n}(x) = \lim_n a_x v_n = a_x \lim_n v_n = a_x v = f_v(x) = (Fv)(x)$.

Thus $g = Fv$, and hence $G(F)$ is closed. By Theorem B.37, $F$ is bounded, so that:

$\|F\| = \sup\{\|f_v\|_\infty, v \in V, \|v\| = 1\} = \sup\{\|a_x v\|, v \in V, \|v\| = 1, x \in X\}$

$= \sup\{\|a_x\|, x \in X\} < \infty$. □

This gives part 1 of Proposition B.77: since $\lim_n a_n\psi$ exists, $\sup_n\{\|a_n\psi\|\} < \infty$ for
each $\psi$, hence $\sup_n\{\|a_n\|\} < \infty$. Since $a_n\psi \to a\psi$ implies $\|a_n\psi\| \to \|a\psi\|$, cf. (B.5),

$\|a\psi\| = \lim_n \|a_n\psi\| \leq \limsup_n \|a_n\|\|\psi\| \leq \sup_n\{\|a_n\|\}\|\psi\|$, (B.250)

so taking the supremum over all unit vectors $\psi$ gives $\|a\| < \infty$.

As to the second part, since $(\langle\varphi,a_n\psi\rangle)$ converges for all $\varphi,\psi \in H$, we have
$\sup_n\{|\langle\varphi,a_n\psi\rangle|\} < \infty$. Using (B.222), this is the same as
$\sup_n\{|f_{a_n\psi}(\varphi)|\} < \infty$ for each $\varphi \in H$, so using Banach-Steinhaus with $V = H$ and
$W = \mathbb{C}$ (so that $B(V,W) = H^*$), $X = \mathbb{N}$, and $a_x = f_{a_n\psi}$, we find
$\sup_n\{\|f_{a_n\psi}\|_{H^*}\} < \infty$. By Theorem B.66, this
gives $\sup_n\{\|a_n\psi\|\} < \infty$ and hence, via a second application of Theorem B.78,
$\sup_n\{\|a_n\|\} < \infty$, or $\|a_n\| \leq C < \infty$ for all $n$, as in the case of strong limits.

This time we have to do a little more work to construct the limit operator $a$. This
requires a second lemma, which generalizes Proposition A.23 to general Hilbert
spaces. To this effect, we say that a sesquilinear form $B : H \times H \to \mathbb{C}$ is bounded if there
is a finite constant $C$ such that $|B(\varphi,\psi)| \leq C\|\varphi\|\|\psi\|$ for all $\varphi,\psi \in H$.

Proposition B.79. The relation $B(\varphi,\psi) = \langle\varphi,a\psi\rangle$ provides a bijective correspondence
between bounded (hermitian/positive) sesquilinear forms and bounded
(self-adjoint/positive) operators $a \in B(H)$, cf. Proposition A.22.1.

Like Proposition A.23, this is a trivial consequence of Theorem B.66. 

To finish the proof of Proposition B.77.2, define $B(\varphi,\psi) = \lim_n\langle\varphi,a_n\psi\rangle$, so

$|B(\varphi,\psi)| \leq \limsup_n \|a_n\|\|\varphi\|\|\psi\| \leq \sup_n \|a_n\|\|\varphi\|\|\psi\| \leq C\|\varphi\|\|\psi\|$. (B.251)

Hence $B$ is bounded, and Proposition B.79 gives the weak limit $a \in B(H)$. □

B.14 Basic spectral theory

In linear algebra, which in our context means the theory of operators on finite-dimensional
Hilbert spaces $H$, the spectrum $\sigma(a)$ of an operator (i.e., a linear map)
$a : H \to H$ was defined as the set of eigenvalues of $a$. This led to the Spectral Theorems
A.10 and A.15. However, as soon as $\dim(H) = \infty$, simple examples show that
even bounded operators may have no eigenvectors (and hence no eigenvalues) at all.
For example, take $H = L^2(0,1)$ and $f(x) = x$, with associated (bounded) multiplication
operator $a = m_f \equiv m_x$, cf. (B.238); this is just a bounded version of the position
operator of quantum mechanics. Then the eigenvalue equation $m_x\psi = \lambda\psi$ implies
$\int_0^1 dx\, |x - \lambda|^2|\psi(x)|^2 = 0$, which holds iff $|x - \lambda||\psi(x)| = 0$ a.e. Since $|x - \lambda|$ is
nonzero a.e. for any $\lambda \in \mathbb{C}$, this implies $\psi(x) = 0$ a.e. and hence $\psi = 0$ in $L^2(0,1)$.
More generally, taking $H = L^2(\mathbb{R}^n)$ and $f \in C_b(\mathbb{R}^n)$, a similar argument shows that
the multiplication operator $m_f$ has eigenvalue $\lambda \in \mathbb{C}$ if and only if the equality $f(x) = \lambda$
holds on a set of positive (Lebesgue) measure. Therefore, if $f$ varies sufficiently,
then $m_f$ has no eigenvalues at all (e.g., for $n = 1$, $f \in C^1([0,1])$ with $f'(x) \neq 0$ a.e.).

Even amidst his magnificent oeuvre, covering most of mathematics, it was one 
of Hilbert’s most prophetic insights that finite-dimensional spectral theory could not 
merely be rescued, but also greatly enriched, by defining the spectrum as follows: 

Definition B.80. Let $H$ be a Hilbert space. The spectrum $\sigma(a)$ of $a \in B(H)$ consists
of all $\lambda \in \mathbb{C}$ for which the operator $a - \lambda : H \to H$ is not bijective. The complement

$\rho(a) = \mathbb{C}\setminus\sigma(a)$ (B.252)

of the spectrum in $\mathbb{C}$ is called the resolvent of $a$, i.e., $z \in \rho(a)$ iff $a - z$ is invertible.

Here $a - \lambda \equiv a - \lambda\cdot 1_H$, where $1_H$ is the unit operator on $H$, and by ‘bijective’ and
‘invertible’ we a priori mean: injective and surjective. This set-theoretic notion of
invertibility is considerably strengthened by Corollary B.35, according to which the
set-theoretic inverse of $a - \lambda : H \to H$, if it exists for $a \in B(H)$, is automatically in
$B(H)$. Consequently, we may equivalently say that $\lambda \in \sigma(a)$ if $a - \lambda$ is not invertible
in $B(H)$. This means that if $z \in \rho(a)$, then the equation $(a - z)\psi = \varphi$ for $\psi \in H$:

• actually has a solution, since $(a - z)$ is surjective;

• has a unique solution, for $(a - z)$ is injective;

• has a unique solution that continuously depends on $\varphi$, as $(a - z)^{-1}$ is bounded.

Thus Definition B.80 becomes a special case of the following purely algebraic idea: 

Definition B.81. Let $A$ be a (complex) algebra with unit. The spectrum $\sigma(a)$ of
$a \in A$ consists of all $\lambda \in \mathbb{C}$ for which the element $a - \lambda$ is not invertible in $A$.

The notation (B.252) also extends to this case. This generalization is especially powerful
when $A$ is a Banach algebra, and, particularly, a C*-algebra, cf. Definition C.1.
The latter case actually incorporates Definition B.80:

Proposition B.82. For any Hilbert space $H$, the set $B(H)$ of all bounded operators
on $H$ is a C*-algebra with unit in the operator norm (A.18).

The proof of Proposition A.7 goes through unchanged. In a different direction:

Proposition B.83. Let $A = C(X)$, where $X$ is a compact Hausdorff space. Then

$\sigma(f) = \mathrm{ran}(f)$. (B.253)

Proof. Since multiplication in $C(X)$ is pointwise, if $f - \lambda\cdot 1_X$ has an inverse, it must
be $1/(f - \lambda\cdot 1_X)$. This function exists (and is continuous) iff $\lambda \notin \mathrm{ran}(f)$. □

Theorem B.84. Let $A = B(H)$ or, more generally, a unital C*-algebra, or, even more
generally, a Banach algebra with unit $1_A$ (cf. Definition C.1). Then the spectrum
$\sigma(a)$ of any $a \in A$ is a nonempty compact subset of $\mathbb{C}$.

Furthermore, defining the spectral radius of $a \in A$ by

$r(a) = \sup\{|\lambda|, \lambda \in \sigma(a)\}$, (B.254)

for general unital Banach algebras we have

$r(a) \leq \|a\|$, (B.255)

as well as Gelfand’s spectral radius formula

$r(a) = \lim_{n\to\infty} \|a^n\|^{1/n}$. (B.256)

If $a \in A_{\mathrm{sa}}$ is a self-adjoint element of a unital C*-algebra, such as $A = B(H)$, then

$r(a) = \|a\|$ $(a^* = a)$. (B.257)

Proof. The claim about the spectrum obviously follows from the following facts:

1. $\sigma(a)$ is a bounded subset of $\mathbb{C}$.

2. $\sigma(a)$ is a closed subset of $\mathbb{C}$.

3. $\sigma(a)$ is a nonempty subset of $\mathbb{C}$.

Eq. (B.255), which implies Fact 1, is equivalent to the implication $|\lambda| > \|a\| \Rightarrow \lambda \in \rho(a)$.
For $\lambda \neq 0$ we have $(a - \lambda) = \lambda((a/\lambda) - 1)$, so, rescaling $a$ if necessary, we only need to show
that if $\|a\| < 1$, then $1 \in \rho(a)$. Indeed, in that case the geometric series for $a$
converges absolutely and hence ($A$ being a Banach space) converges, with

$\lim_{n\to\infty}\sum_{k=0}^n a^k = (1 - a)^{-1}$; (B.258)

the proof is virtually the same as for complex numbers. Thus $1 \in \rho(a)$.
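As a finite-dimensional illustration of the geometric series (B.258), the following Python sketch sums $\sum_{k=0}^{n} a^k$ for a $2\times 2$ real matrix of norm less than one and multiplies the result by $1 - a$. The matrix entries and the helper names (`matmul`, `matadd`, `identity`, `neumann_sum`) are assumptions made for this example only.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(a, terms):
    """Partial sum a^0 + a^1 + ... + a^terms of the geometric series."""
    n = len(a)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, a)
        total = matadd(total, power)
    return total

a = [[0.2, 0.3], [-0.1, 0.4]]                # operator norm below 1
one_minus_a = [[1 - 0.2, -0.3], [0.1, 1 - 0.4]]
approx_inverse = neumann_sum(a, 60)
residual = matmul(one_minus_a, approx_inverse)   # should be close to the identity
print(residual)
```

The product `residual` agrees with the unit matrix up to rounding, which is the matrix version of the statement $1 \in \rho(a)$ for $\|a\| < 1$.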

Fact 2 is equivalent to the set $A^\times$ of invertible elements in $A$ being open in $A$.
Indeed, for given $a \in A^\times$, take $b \in A$ for which $\|b\| < \|a^{-1}\|^{-1}$. This implies

$\|a^{-1}b\| \leq \|a^{-1}\|\,\|b\| < 1$. (B.259)

Hence by (B.258), applied with $-a^{-1}b$ (which has norm $< 1$) in place of $a$, the operator
$a + b = a(1 + a^{-1}b)$ has an inverse, namely $(1 + a^{-1}b)^{-1}a^{-1}$. Taking
$\varepsilon \leq \|a^{-1}\|^{-1}$, it follows that all $c \in A$ for which
$\|a - c\| < \varepsilon$ lie in $A^\times$ (which is therefore an open subset of the metric space $A$).

To complete the proof of Fact 2, take $a \in A$ and define $f : \mathbb{C} \to A$ by $f(z) = z - a$. Since

$\|f(z + \delta) - f(z)\| = |\delta|$,

we see that $f$ is continuous (take $\delta = \varepsilon$ in the definition of continuity). By part 2
of the proof, $f^{-1}(A^\times)$ is open in $\mathbb{C}$. But $f^{-1}(A^\times)$ is the set of all $z \in \mathbb{C}$ where $z - a$
has an inverse, so that $f^{-1}(A^\times) = \rho(a)$. This set being open, its complement $\sigma(a)$
is closed. Now, to prove Fact 3, define


$g : \rho(a) \to A$; (B.260)

$z \mapsto (z - a)^{-1}$. (B.261)

For fixed $z_0 \in \rho(a)$, choose $z \in \mathbb{C}$ such that $|z - z_0| < \|(a - z_0)^{-1}\|^{-1}$. From part 2
of the proof, with $a$ replaced by $a - z_0$ and $c$ replaced by $a - z$, we see that $z \in \rho(a)$,
as $\|a - z_0 - (a - z)\| = |z - z_0|$. Moreover, because

$\|(z_0 - z)(z_0 - a)^{-1}\| = |z_0 - z|\,\|(z_0 - a)^{-1}\| < 1$, (B.262)

the power series

$\frac{1}{z_0 - a}\sum_{k=0}^{n}\left(\frac{z_0 - z}{z_0 - a}\right)^k$ (B.263)

is absolutely convergent and hence convergent for $n \to \infty$. By (B.258), the limit
$n \to \infty$ of this power series is

$\frac{1}{z_0 - a}\sum_{k=0}^{\infty}\left(\frac{z_0 - z}{z_0 - a}\right)^k = \frac{1}{z_0 - a}\left(1 - \frac{z_0 - z}{z_0 - a}\right)^{-1} = \frac{1}{z - a} = g(z)$. (B.264)

Hence

$g(z) = \sum_{k=0}^{\infty}(z_0 - z)^k(z_0 - a)^{-k-1}$ (B.265)

is a norm-convergent power series. For $z \neq 0$ we write $\|g(z)\| = |z|^{-1}\|(1_A - a/z)^{-1}\|$
and observe that $\lim_{z\to\infty} 1_A - a/z = 1_A$, since $\lim_{z\to\infty}\|a/z\| = 0$. Hence we obtain
$\lim_{z\to\infty}(1_A - a/z)^{-1} = 1_A$, and

$\lim_{z\to\infty}\|g(z)\| = 0$. (B.266)

Let $\varphi \in A^*$; since $\varphi$ is bounded, eq. (B.265) implies that the function $g_\varphi : z \mapsto
\varphi(g(z))$ is given by a convergent power series (i.e. is analytic), and (B.266) implies

$\lim_{z\to\infty} g_\varphi(z) = 0$. (B.267)

Now suppose that $\sigma(a) = \emptyset$, so that $\rho(a) = \mathbb{C}$. The function $g$, and hence $g_\varphi$, is
then defined on $\mathbb{C}$, where it is analytic and vanishes at infinity. In particular, $g_\varphi$ is
bounded, so that by Liouville’s Theorem of elementary complex analysis it must be
constant. By (B.267) this constant is zero, so that $g = 0$ by Corollary B.45. This is
absurd, so that $\rho(a) \neq \mathbb{C}$, and hence $\sigma(a) \neq \emptyset$.

We now prove the spectral radius formula (B.256). For $|z| > \|a\|$ the function $g$
defined in (B.260) - (B.261) has a norm-convergent power series

$g(z) = \frac{1}{z}\sum_{k=0}^{\infty}\left(\frac{a}{z}\right)^k$. (B.268)

On the other hand, we have seen that for any $z \in \rho(a)$ one may find a $z_0 \in \rho(a)$ such
that the power series (B.265) converges (i.e. in norm). If $|z| > r(a)$ then $z \in \rho(a)$,
so (B.265) converges for $|z| > r(a)$, uniformly in $z$. Therefore (by the theory of
analytic functions taking values in Banach spaces), eq. (B.268) is norm-convergent
for $|z| > r(a)$, too, which in turn implies that $\|a^n\|/|z|^n < 1$ for large enough $n$ (proof
by contradiction). Since this is true for all $z$ for which $|z| > r(a)$, we must have

$\limsup_{n\to\infty}\|a^n\|^{1/n} \leq r(a)$. (B.269)

To derive a second inequality towards (B.256), we use the spectral mapping property
for polynomials, which states that for any (complex) polynomial $p$ on $\mathbb{C}$,

$\sigma(p(a)) = p(\sigma(a)) = \{p(\lambda) \mid \lambda \in \sigma(a)\}$. (B.270)

Given some polynomial $p$ of degree $n$ (in a variable $z$) and some fixed $\lambda \in \mathbb{C}$, let

$q(z) = p(z) - \lambda = c_0\prod_{k=1}^{n}(z - c_k)$, (B.271)

for some $c_0,\ldots,c_n \in \mathbb{C}$. Hence by (A.53) - (A.55), we have $q(a) = c_0\prod_{k=1}^{n}(a - c_k)$.
Now an operator $b = b_0\cdots b_n$ with mutually commuting factors (as is the case here) is
invertible iff each factor $b_k$ is invertible (in which
case $b^{-1} = b_n^{-1}\cdots b_0^{-1}$), so $\lambda \in \sigma(p(a))$ iff some $c_k \in \sigma(a)$ (where $k > 0$, as $c_0 \neq 0$).
Since the $c_k$ are precisely the solutions of $q(c) = 0$, i.e., of $\lambda = p(c)$, this proves (B.270).

To conclude the proof of (B.256), we note that since $\sigma(a)$ is compact, there is
$\lambda \in \sigma(a)$ for which $|\lambda| = r(a)$. Since $\lambda^m \in \sigma(a^m)$ by (B.270), one has $|\lambda^m| \leq \|a^m\|$
by (B.255). Hence $\|a^m\|^{1/m} \geq |\lambda| = r(a)$. Combining this with (B.269) yields

$\limsup_{n\to\infty}\|a^n\|^{1/n} \leq r(a) \leq \|a^m\|^{1/m}$ $(m \in \mathbb{N})$. (B.272)

Hence the limit must exist, and $\lim_{n\to\infty}\|a^n\|^{1/n} = \inf_m \|a^m\|^{1/m} = r(a)$, i.e., (B.256).
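To see (B.255), (B.256) and (B.272) at work, one may look at a non-normal $2\times 2$ matrix, for which $r(a) < \|a\|$. The Python sketch below is an illustration only: the matrix $a$ (with the single eigenvalue $1/2$, so $r(a) = 1/2$) and the helpers `matmul2`, `op_norm2` and `gelfand_sequence` are assumptions of this example, and the operator norm is computed from the closed-form largest eigenvalue of the symmetric matrix $a^T a$.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
            [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]]]

def op_norm2(a):
    """Operator norm of a real 2x2 matrix: sqrt of the largest eigenvalue of a^T a."""
    p = a[0][0] ** 2 + a[1][0] ** 2                # (a^T a)_11
    q = a[0][0] * a[0][1] + a[1][0] * a[1][1]      # (a^T a)_12
    r = a[0][1] ** 2 + a[1][1] ** 2                # (a^T a)_22
    largest = 0.5 * (p + r + math.sqrt((p - r) ** 2 + 4 * q * q))
    return math.sqrt(largest)

def gelfand_sequence(a, n_max):
    """Return the list of ||a^n||^(1/n) for n = 1, ..., n_max."""
    out, power = [], [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, n_max + 1):
        power = matmul2(power, a)
        out.append(op_norm2(power) ** (1.0 / n))
    return out

a = [[0.5, 1.0], [0.0, 0.5]]       # upper triangular: only eigenvalue 0.5
seq = gelfand_sequence(a, 200)
print(seq[0], seq[9], seq[99], seq[199])
```

In this example the first entry $\|a\|$ is well above $1/2$, every entry stays above $r(a) = 1/2$ as in (B.272), and the entries approach $1/2$ slowly as $n$ grows.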

Finally, given axiom (C.2) for C*-algebras (which include $B(H)$ by Proposition
A.7 and Theorem B.33), eq. (B.257) follows from (B.256): for self-adjoint $a$, eq.
(C.2) reads $\|a^2\| = \|a\|^2$, and hence, by induction (each $a^{2^k}$ being self-adjoint, too),
$\|a^{2^k}\| = \|a\|^{2^k}$. So if we take the limit in (B.256) along the subsequence
$n = 2^k$ (as we are entitled to, given convergence), we obtain (B.257). □

We may also generalize Definition B.80 in a different direction, where we allow
$a : D(a) \to H$ to be unbounded. In that case, there is room for some ambiguity, as
a possible set-theoretic inverse of $a - z$, if it exists as a (necessarily linear) map
$(a - z)^{-1} : H \to D(a)$, is no longer guaranteed to be bounded. By the argument
preceding Definition B.81 this would, of course, be desirable, which motivates:

Definition B.85. Let $H$ be a Hilbert space, and let $a : D(a) \to H$ be a possibly unbounded
operator (always by definition with dense domain).

1. The resolvent $\rho(a)$ consists of all $z \in \mathbb{C}$ for which $a - z : D(a) \to H$ has a
bounded (linear) inverse $(a - z)^{-1} : H \to D(a)$, so that $(a - z)^{-1} \in B(H)$.

2. The spectrum $\sigma(a) = \mathbb{C}\setminus\rho(a)$ is the complement of the resolvent (i.e. in $\mathbb{C}$).

This provides further motivation for requiring an unbounded operator to be closed:

Proposition B.86. Let $a : D(a) \to H$ be a possibly unbounded operator.

• If $a$ is closed, then $z \in \rho(a)$ iff $a - z$ has a set-theoretic inverse.

• If $a$ is not closed, then $\rho(a) = \emptyset$.

Proof. The graph $G(a^{-1})$ in $H \oplus H$ is the image of $G(a)$ under the linear homeomorphism
$(\psi_1,\psi_2) \mapsto (\psi_2,\psi_1)$, hence if $a$ is closed, then $a^{-1}$ is closed and hence
bounded (cf. Theorem B.37). Similarly, if $G(a)$ is not closed, then $G(a^{-1})$ cannot
be closed either, and hence $a^{-1}$ cannot be bounded. Likewise with $a$ replaced by $a - z$. □

Thus spectral theory always deals with closed operators $a$, like self-adjoint ones.
We now show that Definition B.80 is compatible with our earlier §A.4.

Proposition B.87. Let $V$ be a finite-dimensional vector space and let $a : V \to V$ be
a linear map. Then $a$ is injective iff it is surjective.

Proof. This follows from the elementary fact that for any linear map $a : V \to W$ one
has $\mathrm{ran}(a) \cong V/\ker(a)$. Now if $V = W$ is finite-dimensional one has $V \cong \mathbb{C}^n$ (on
choice of a basis), and one may simply count dimensions to infer that

$\dim(\mathrm{ran}(a)) = n - \dim(\ker(a))$.

Surjectivity of $a$ then yields injectivity and vice versa: we have $\dim(\mathrm{ran}(a)) = n$ iff
$\dim(\ker(a)) = 0$ iff $\ker(a) = \{0\}$. □

Note that this proposition yields the very simplest case of the Atiyah-Singer index
theorem, for which these mathematicians received the Abel Prize for 2004. We
define the index of a linear map $a : V \to W$ as

$\mathrm{index}(a) = \dim(\ker(a)) - \dim(\mathrm{coker}(a))$, (B.273)

where $\mathrm{coker}(a) = W/\mathrm{ran}(a)$, provided both quantities are finite. If $V$ and $W$ are
finite-dimensional, Proposition B.87 yields the baby index theorem

$\mathrm{index}(a) = \dim(V) - \dim(W)$. (B.274)

In particular, if $V = W$, then $\mathrm{index}(a) = 0$ for any linear map $a$ (in general, the
index theorem expresses the index of an operator in terms of topological data; in
this simple situation the only such data are the dimensions of $V$ and $W$).
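The baby index theorem (B.274) is easily checked on explicit matrices. In the Python sketch below, a linear map $a : V \to W$ is represented (an assumption of this example) by a matrix given as a list of rows, so that $\dim(V)$ is the number of columns and $\dim(W)$ the number of rows; the helper names `rank` and `index` are illustrative. The kernel and cokernel dimensions are both obtained from the rank, which then drops out of the difference, exactly as in the dimension count used for Proposition B.87.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return r

def index(rows):
    """dim ker(a) - dim coker(a) for the map a : V -> W given by the matrix `rows`."""
    dim_w, dim_v = len(rows), len(rows[0])
    rk = rank(rows)
    dim_ker = dim_v - rk          # dimension count as in the proof of Proposition B.87
    dim_coker = dim_w - rk        # coker(a) = W / ran(a)
    return dim_ker - dim_coker

print(index([[1, 2, 3], [2, 4, 6]]))   # a : C^3 -> C^2 of rank 1
print(index([[1, 2], [2, 4]]))         # singular map C^2 -> C^2
```

The first map has a two-dimensional kernel and a one-dimensional cokernel, and the second is neither injective nor surjective; in both cases the index only depends on the dimensions of $V$ and $W$.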

Corollary B.88. If $a$ is an operator on a finite-dimensional Hilbert space, then the
spectrum $\sigma(a)$ of $a$ is the set of its eigenvalues.

Proof. It immediately follows from Proposition B.87 that $a - z$ is invertible iff $z$ is
not an eigenvalue of $a$. □

Returning to Definition B.80, we see that if $\lambda$ is an eigenvalue of $a$ (in that, as
in finite dimension, there exists a nonzero vector $\psi \in H$ for which $a\psi = \lambda\psi$), then
$\lambda \in \sigma(a)$ (for $a - \lambda$ is not even injective, let alone invertible). Thus we may define:

• the point spectrum $\sigma_p(a)$ of $a$ as the set of its eigenvalues, so that $\sigma_p(a) \subseteq \sigma(a)$;

• the continuous spectrum, which (if it exists) is the remainder of $\sigma(a)$, i.e.,

$\sigma_c(a) = \sigma(a)\setminus\sigma_p(a)$. (B.275)

If $\sigma(a) = \sigma_p(a)$, we call $\sigma(a)$ discrete. The example at the beginning of this section
shows the opposite case, viz. $\sigma_p(m_x) = \emptyset$ and $\sigma_c(m_x) = [0,1]$. This follows from:

Proposition B.89. Let $H = L^2(X,\Sigma,\mu)$ for some $\sigma$-finite Borel space $(X,\Sigma,\mu)$ such
that $\mu(A) > 0$ for each nonempty open $A \subset X$, and let $f \in C(X)$. Then

$\sigma(m_f) = \mathrm{ran}(f)^-$. (B.276)

Cf. Proposition B.73. More generally, let $f : X \to \mathbb{C}$ be (Borel) measurable. Then

$\sigma(m_f) = \text{ess-ran}(f)$, (B.277)

where the essential range ess-ran$(f)$ of $f$ consists of all $z \in \mathbb{C}$ such that

$\forall\varepsilon > 0 : \mu(\{x \in X : |f(x) - z| < \varepsilon\}) > 0$. (B.278)

Proof. The second claim implies the first, for ess-ran$(f) = \mathrm{ran}(f)^-$ if $f \in C(X)$.

To prove the second claim, we use the functions $\varphi_n = 1_{A_n}\psi$ from the proof of
Proposition B.73, where $\psi \in H$ is arbitrary. If $0 \notin \sigma(m_f)$, then $m_f$ is invertible, so
there is $b \in B(H)$ such that $f b\varphi_n = \varphi_n$. This implies that $f(x) \neq 0$ a.e. on $A_n$, with
$b\varphi_n = m_{1/f}\varphi_n$. Because $n \in \mathbb{N}$ is also arbitrary and $X = \bigcup_n A_n$, this gives $f(x) \neq 0$
a.e. on $X$, and since the linear span of the $\varphi_n$ is dense in $H$, we obtain $b = m_{1/f}$,
provided $b = m_f^{-1}$ exists (which should not surprise us, for $m_f^{-1} = m_{1/f}$). From
(B.240), with $f$ replaced by $1/f$, we then obtain $\|1/f\|_\infty^{\mathrm{ess}} = \|m_{1/f}\| < \infty$ (from $0 \in \rho(m_f)$).

The point is that $\|1/f\|_\infty^{\mathrm{ess}} < \infty$ iff there is $\varepsilon > 0$ such that $|f(x)| \geq \varepsilon$ almost
everywhere, i.e., $\mu(\{x \in X : |f(x)| < \varepsilon\}) = 0$. The negation of this condition states
that $\forall\varepsilon > 0 : \mu(\{x \in X : |f(x)| < \varepsilon\}) > 0$, that is, $0 \in$ ess-ran$(f)$. Therefore, we have
shown that $0 \in \sigma(m_f)$ iff $0 \in$ ess-ran$(f)$; if $f \in C(X)$, this is the same as $0 \in \mathrm{ran}(f)^-$.

To finish, note that $m_f - \lambda\cdot 1_H = m_{f-\lambda}$, where $f - \lambda$ is the function $x \mapsto f(x) - \lambda$.
This gives $\lambda \in \sigma(m_f)$ iff $0 \in \sigma(m_{f-\lambda})$, which is true iff $\lambda \in$ ess-ran$(f)$. □

Corollary B.90. If $\mu(f = \lambda) = 0$ for all $\lambda \in \mathbb{C}$, then $\sigma_p(m_f) = \emptyset$.

Thus the combination $\sigma_p(a) = \emptyset$ and $\sigma_c(a) \neq \emptyset$, which is the opposite of the
finite-dimensional situation, is very well possible. To shed further light on the still
somewhat mysterious idea of a continuous spectrum, we now present Weyl’s theory of
the spectrum. We say that a possibly unbounded operator $a : D(a) \to H$ is normal
when $D(a^*) = D(a)$ and $\|a^*\psi\| = \|a\psi\|$ for each $\psi \in D(a)$; if $a$ is bounded, this is
equivalent to the familiar definition $a^*a = aa^*$. Self-adjoint operators are normal.

Theorem B.91. Let $a : D(a) \to H$ be normal. Then $\lambda \in \sigma(a)$ iff there exists a
sequence $(\psi_n)$ of unit vectors in $D(a)$ such that

$\lim_{n\to\infty}\|(a - \lambda)\psi_n\| = 0$. (B.279)

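Of course, this is useful only as a new characterization of $\lambda \in \sigma_c(a)$; if $\lambda \in \sigma_p(a)$
one may simply take $\psi_n = \psi$ for all $n$, where $a\psi = \lambda\psi$. For a simple example, take

$H = L^2(\mathbb{R})$; (B.280)

$a = m_f$ $(f \in C(\mathbb{R}))$; (B.281)

$\lambda = f(x_0)$ $(x_0 \in \mathbb{R})$, (B.282)

so that $\lambda \in \mathrm{ran}(f) \subseteq \sigma_c(m_f) = \sigma(m_f)$, and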


i//„(x) = 

(B.283) 


Then $\|\psi_n\| = 1$ and $\lim_n \|(m_f - \lambda)\psi_n\| = 0$, although $(\psi_n)$ has no limit in $L^2(\mathbb{R})$.
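A sequence with the properties just stated can be explored numerically. The sketch below rests on an assumption: it uses normalized Gaussians $\psi_n(x) = (n/\pi)^{1/4}e^{-n(x-x_0)^2/2}$ concentrating at $x_0$, which is one choice having unit norm and vanishing residual, chosen here for illustration and not asserted to be the expression in (B.283). The function names and the midpoint-rule integration are likewise choices of this example.

```python
import math

def gaussian(n, x0):
    """Hypothetical choice: psi_n(x) = (n/pi)^(1/4) exp(-n (x - x0)^2 / 2)."""
    c = (n / math.pi) ** 0.25
    return lambda x: c * math.exp(-0.5 * n * (x - x0) ** 2)

def l2_norm(g, center, half_width, steps=20000):
    """Midpoint-rule approximation of the L^2 norm of g on a window around `center`."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = center - half_width + (k + 0.5) * h
        total += g(x) ** 2
    return math.sqrt(total * h)

def weyl_residual(f, n, x0):
    """Return (||psi_n||, ||(m_f - f(x0)) psi_n||) for the Gaussian psi_n above."""
    psi = gaussian(n, x0)
    lam = f(x0)
    hw = 12.0 / math.sqrt(n)   # the Gaussian is negligible outside this window
    return l2_norm(psi, x0, hw), l2_norm(lambda x: (f(x) - lam) * psi(x), x0, hw)

for n in (1, 10, 100, 1000):
    print(n, weyl_residual(math.tanh, n, 0.3))
```

For $f(x) = x$ the residual of this particular choice can be computed by hand as $\|(m_f - \lambda)\psi_n\| = (2n)^{-1/2}$, which the numerical values reproduce; for other continuous $f$, such as the bounded function `math.tanh` used above, the residual also tends to zero while the norm stays equal to one.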

Proof. One direction is easy by reductio ad absurdum: if the given sequence $(\psi_n)$
exists yet $\lambda \in \rho(a)$, then, since $(a - \lambda)^{-1}$ would exist and would be bounded, for
any sequence $(\varphi_n)$ in $H$, $\varphi_n \to 0$ implies $(a - \lambda)^{-1}\varphi_n \to 0$, so taking $\varphi_n = (a - \lambda)\psi_n$,
we find that $(a - \lambda)\psi_n \to 0$ implies $\psi_n \to 0$. Therefore, the assumption $\|\psi_n\| = 1$
cannot be true, and hence $\lambda \notin \rho(a)$, which is to say that $\lambda \in \sigma(a)$.

The converse direction requires two instructive lemmas of independent interest. 

Lemma B.92. Let $a \in B(H)$ (or, more generally, let $a : D(a) \to H$ be closed). Then

$\mathrm{ran}(a)^- = \ker(a^*)^\perp$; (B.284)

$\mathrm{ran}(a)^\perp = \ker(a^*)$. (B.285)

In particular, we have $\mathrm{ran}(a)^- = H$ iff $\ker(a^*) = \{0\}$.

Furthermore, we say that $a$ is norm-positive (a neologism!) if there exists $\alpha > 0$
such that $\|a\psi\| \geq \alpha\|\psi\|$ for each $\psi \in H$ (or each $\psi \in D(a)$). Then:

1. If $a$ is norm-positive, then $\mathrm{ran}(a)$ is closed.

2. The operator $a$ is invertible iff $a$ is norm-positive and $\ker(a^*) = \{0\}$.

3. A normal operator is invertible iff it is norm-positive.



The last point provides the remainder of the proof of Theorem B.91, for if $\lambda \in \sigma(a)$,
then $a - \lambda$ is not invertible, so for each $\varepsilon = 1/n$ there is a unit vector $\psi_n \in H$ (or
$\psi_n \in D(a)$) such that $\|(a - \lambda)\psi_n\| < 1/n$, and hence we have our sequence $(\psi_n)$. □

It remains to prove Lemma B.92. Eqs. (B.284) - (B.285) are easy exercises, using
(B.204). For clause 1, if $(\varphi_n)$ is a Cauchy sequence in $\mathrm{ran}(a)$ converging to $\varphi \in H$,
then $\varphi_n = a\psi_n$ for some $\psi_n \in D(a)$. Since $\|\psi_m - \psi_n\| \leq \alpha^{-1}\|\varphi_n - \varphi_m\|$, the
sequence $(\psi_n)$ is Cauchy, too, and if $\psi_n \to \psi$, then $\varphi_n \to a\psi = \varphi$, so $\varphi \in \mathrm{ran}(a)$; in
the unbounded case this is because $a$ is closed. For clause 2, if $a$ is invertible, then
for $\psi \in D(a)$, we have $\|\psi\| = \|a^{-1}a\psi\| \leq \|a^{-1}\|\,\|a\psi\|$, since $a^{-1}$ is bounded, and
therefore $a$ is norm-positive with (for example) $\alpha = \|a^{-1}\|^{-1}$. Moreover, invertibility
implies surjectivity, i.e., $\mathrm{ran}(a) = H$, and hence $\ker(a^*) = \{0\}$ by (B.284).

Conversely, if $a$ is norm-positive, then it is trivially injective, and if $\ker(a^*) =
\{0\}$, then $\mathrm{ran}(a)^- = H$, again by (B.284). But since $a$ is also norm-positive,
$\mathrm{ran}(a)^- = \mathrm{ran}(a)$, so $\mathrm{ran}(a) = H$ and $a$ is surjective, too. Clause 3 now also follows,
since for normal operators $a$ we have $\ker(a) = \ker(a^*)$, so $a$ being norm-positive,
implying $\ker(a) = \{0\}$ in any case, now also implies $\ker(a^*) = \{0\}$. □

The same lemma yields crucial information on spectra of self-adjoint operators. 

Theorem B.93. If $a : D(a) \to H$ is self-adjoint, then $\sigma(a) \subset \mathbb{R}$, and if two eigenvalues
$\lambda,\lambda' \in \sigma_p(a)$ are different, then corresponding eigenvectors are orthogonal.
Furthermore, for each $z \in \mathbb{C}$ exactly one of the following possibilities applies:

• $z \in \rho(a)$ iff $\mathrm{ran}(a - z) = H$;

• $z \in \sigma_c(a)$ iff $\mathrm{ran}(a - z)^- = H$ but $\mathrm{ran}(a - z) \neq H$;

• $z \in \sigma_p(a)$ iff $\mathrm{ran}(a - z)^- \neq H$.

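Proof. If $a^* = a$ then $\langle\psi,a\psi\rangle$ is real, so $|\langle\psi,(a - z)\psi\rangle| \geq |\mathrm{Im}(z)|\,\|\psi\|^2$ for any $z \in \mathbb{C}$.
Combined with Cauchy-Schwarz, this gives the inequality

$\|(a - z)\psi\| \geq |\mathrm{Im}(z)|\,\|\psi\|$. (B.286)

Therefore, for $z \in \mathbb{C}\setminus\mathbb{R}$ the normal operator $a - z$ is norm-positive, and hence invertible
by Lemma B.92.3, so that $\sigma(a) \subset \mathbb{R}$. Next, if $a\psi = \lambda\psi$ and $a\psi' = \lambda'\psi'$,

$(\lambda' - \lambda)\langle\psi,\psi'\rangle = \langle\psi,a\psi'\rangle - \langle a\psi,\psi'\rangle = \langle(a^* - a)\psi,\psi'\rangle = 0$, (B.287)

given that $\lambda,\lambda' \in \mathbb{R}$ and $a^* = a$; assuming $\lambda' \neq \lambda$, this gives $\langle\psi,\psi'\rangle = 0$.

Furthermore, for $z \in \mathbb{C}\setminus\mathbb{R}$, we have $z \in \rho(a)$ and hence trivially $\mathrm{ran}(a - z) =
H$; conversely, the latter property states surjectivity of $a - z$, whilst (B.286) yields
injectivity, so jointly, $z \in \rho(a)$. For $z \in \mathbb{R}$, assuming $\mathrm{ran}(a - z) = H$, eq. (B.285)
yields $\ker(a^* - \overline{z}) = \{0\}$, but since $a^* = a$ and $\overline{z} = z$, this is just injectivity of $a - z$,
whence once more $z \in \rho(a)$. Similarly, if $z \in \mathbb{R}$, then $\mathrm{ran}(a - z)^- \neq H$ iff $\ker(a - z) \neq
\{0\}$, which yields the third case $z \in \sigma_p(a)$. The middle case is all that remains. □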

This result reconfirms Corollary B.88 to the effect that continuous spectrum cannot
occur if $\dim(H) < \infty$, since in that case (where linear subspaces are automatically
closed) the second scenario in Theorem B.93 is impossible.

Although he did not live to see it, on Hilbert’s visionary Definition B.80 of the spectrum,
part 1 of Theorem A.15 still holds verbatim even if $H$ is infinite-dimensional:

Theorem B.94. Let $H$ be a Hilbert space, suppose $a \in B(H)$ is self-adjoint, and let
$C^*(a)$ be the C*-algebra generated within $B(H)$ by $a$ and $1_H$ (that is, the intersection
of all C*-algebras in $B(H)$ containing $a$ and $1_H$). Then $C^*(a)$ is commutative, and there is a
(necessarily isometric) isomorphism of (commutative) C*-algebras

$C(\sigma(a)) \xrightarrow{\cong} C^*(a)$, $f \mapsto f(a)$, (B.288)

which is unique if it is subject to the following conditions:

• the unit function $1_{\sigma(a)} : \lambda \mapsto 1$ corresponds to the unit operator $1_H$;

• the identity function $\mathrm{id}_{\sigma(a)} : \lambda \mapsto \lambda$ is mapped to the given operator $a$.

The map $f \mapsto f(a)$ is called the continuous functional calculus. In particular,

$(tf + g)(a) = tf(a) + g(a)$; (B.289)

$(fg)(a) = f(a)g(a)$; (B.290)

$f(a)^* = f^*(a)$. (B.291)

It is worth mentioning that by Theorem C.62 (cf. Appendix C) an isomorphism 
of C*-algebras is automatically isometric, but in this case the equality 

ll/(«)ll = ll/l|oo, (B.292) 

acts as a lemma in the proof that (B.288) is an isomorphism, so we need to prove it 
explicitly; cf. (B.225) for the left-hand side, and (1.24) for the right-hand side. 
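To make (B.290) and (B.292) concrete, here is a minimal finite-dimensional sketch. It assumes, purely for illustration, that $a$ is a real diagonal matrix, so that $\sigma(a)$ is the set of its diagonal entries and $f(a)$ is obtained by applying $f$ entrywise; the helper names are hypothetical and not part of the text.

```python
# Finite-dimensional sketch of the continuous functional calculus f -> f(a).
# Assumption: a is a real diagonal (hence self-adjoint) matrix, stored as the
# list of its diagonal entries, so sigma(a) is simply the set of those entries.

def apply_function(f, diag):
    """Return the diagonal of f(a) for a = diag(diag)."""
    return [f(x) for x in diag]

def operator_norm(diag):
    """Operator norm of a diagonal matrix: the largest absolute entry."""
    return max(abs(x) for x in diag)

def sup_norm(f, spectrum):
    """Sup-norm of f on the (finite) spectrum."""
    return max(abs(f(x)) for x in spectrum)

a = [-1.0, 0.5, 2.0, 2.0]        # diagonal entries; sigma(a) = {-1, 0.5, 2}
spectrum = sorted(set(a))

f = lambda x: x * x - 3.0
g = lambda x: 2.0 * x + 1.0
fg = lambda x: f(x) * g(x)

# (B.290): (fg)(a) = f(a) g(a); diagonal matrices multiply entrywise.
lhs = apply_function(fg, a)
rhs = [u * v for u, v in zip(apply_function(f, a), apply_function(g, a))]

# (B.292): ||f(a)|| = ||f||_infty, the supremum being taken over sigma(a).
norm_fa = operator_norm(apply_function(f, a))
norm_f = sup_norm(f, spectrum)
```

In this toy case both identities hold exactly; the repeated eigenvalue 2 shows that multiplicities play no role in the norm.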

Note that Theorem B.94 is even true for the larger class of normal bounded operators $a$ (which might even be defined by the property that $C^*(a)$ is commutative), but for applications to quantum mechanics it is sufficient to deal with the self-adjoint case (which even mathematically is not a restriction, as it implies the normal case).

Proof. We repeat (A.52) and (A.53) - (A.55), obtaining a map $f \mapsto f(a)$ defined for polynomials $f$ on $\mathbb{R}$, restricted to $\sigma(a) \subset \mathbb{R}$. The *-algebra $P^*(a)$ of all polynomials in $a$ is dense in $C^*(a)$ by definition of the latter, since one cannot have a smaller C*-algebra in $B(H)$ containing $a$ and $1_H$ than the norm-closure of $P^*(a)$. In order to take advantage of this, we need the following lemma.

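Lemma B.95. 1. For any $a \in B(H)$ and any polynomial $p$ on $\mathbb{C}$, we have

$\sigma(p(a)) = p(\sigma(a)) \equiv \{p(\lambda) \mid \lambda \in \sigma(a)\}$.  (B.293)

2. For any $a \in B(H)$, we have

$\|a\| = \sqrt{r(a^*a)}$,  (B.294)

see (B.254). In particular, if $a^* = a$, then $\|a\| = r(a)$, cf. (B.257).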
This is part of Theorem B.84, but we now give a direct proof of the second part. We first note that if $a^* = a$, then either $\|a\|$ or $-\|a\|$ (or both) are in $\sigma(a)$. To show this, take a sequence $(\psi_n)$ of unit vectors in $H$ such that $\lim_n \|a\psi_n\| = \|a\|$. Then

$\|(a^2 - \|a\|^2)\psi_n\|^2 = \|a^2\psi_n\|^2 + \|a\|^4 - 2\|a\|^2 \|a\psi_n\|^2 \leq 2\|a\|^4 - 2\|a\|^2 \|a\psi_n\|^2$,  (B.295)

so that $\lim_n \|(a^2 - \|a\|^2)\psi_n\|^2 = 0$, and hence $\|a\|^2 \in \sigma(a^2)$ by Theorem B.91. But part 1 of the lemma gives $\sigma(a^2) = \{\lambda^2 \mid \lambda \in \sigma(a)\}$, so that $\|a\| \in \sigma(a)$ or $-\|a\| \in \sigma(a)$.

The second observation is that, for general $a \in B(H)$, if some $z \in \mathbb{C}$ has $|z| > \|a\|$, then $z \in \rho(a)$. This follows from (part 1 of) the proof of Theorem B.84. Thus we firstly have $r(a) \geq \|a\|$ (for $a^* = a$), and secondly (for all $a$), $r(a) \leq \|a\|$.

Using Lemma B.95, we now prove that (B.292) holds for real polynomials $f = p$:

$\|p(a)\| = r(p(a)) = \sup\{|\lambda|, \lambda \in \sigma(p(a))\} = \sup\{|\lambda|, \lambda \in p(\sigma(a))\}$
$= \sup\{|p(\lambda)|, \lambda \in \sigma(a)\} = \|p\|_\infty$.  (B.296)

The case of complex polynomials $p$ follows from this, since, using (B.289) - (B.291),

$\|p(a)\|^2 = \|p(a)^* p(a)\| = \|(\bar{p}p)(a)\| = \|\bar{p}p\|_\infty = \|p\|_\infty^2$.  (B.297)

Thus we have proved the isometric *-algebra isomorphism $P(\sigma(a)) \cong P^*(a)$, where $P(\sigma(a))$ and $P^*(a)$ are the canonically normed vector spaces of all finite polynomials in $\lambda \in \sigma(a)$ and in $a \in B(H)$, respectively. Neither is complete (when $H$ is infinite-dimensional and $a \neq 0$), but given isometricity, it is easy to pass to their completions, which by Weierstrass and by definition are $C(\sigma(a))$ and $C^*(a)$, respectively. Thus for $f \in C(\sigma(a))$ we find a sequence $(p_n)$ in $P(\sigma(a))$ such that $p_n \to f$ (from which it follows that $(p_n)$ is Cauchy in $C(\sigma(a))$), and define

$f(a) = \lim_n p_n(a)$;  (B.298)

this limit exists because $\|p_n(a) - p_m(a)\| = \|p_n - p_m\|_\infty$, so that $(p_n(a))$ is Cauchy in the Banach space $C^*(a)$. Furthermore, if $p'_n \to f$, and $f'(a) = \lim_n p'_n(a)$, then

$\|f(a) - f'(a)\| = \lim_n \|p_n(a) - p'_n(a)\| = \lim_n \|p_n - p'_n\|_\infty = 0$,  (B.299)

so $f'(a) = f(a)$. From (B.296) - (B.298) and continuity of the norm (i.e. $\|f(a)\| = \lim_n \|p_n(a)\|$, which gives (B.292)), the map $f \mapsto f(a)$ is isometric and hence injective on $C(\sigma(a))$, and the above construction trivially makes it surjective.

Finally, the properties (B.289) - (B.291) follow from (A.53) - (A.55) by continuity. These properties also imply the uniqueness of the map $f \mapsto f(a)$ given the conditions stated in the theorem, because these conditions and (A.53) - (A.55) define the map on $P(\sigma(a))$ and hence, by continuity, also on $C(\sigma(a))$. □
For a nice reformulation of Theorem B.94 in terms of the Gelfand spectrum, cf. §C.4. For later use (cf. Proposition B.98 below) we add a related result.

Lemma B.96. If $a \in B(H)$ is self-adjoint, then

$\|a\| = \sup\{|\langle \psi, a\psi \rangle|, \psi \in H, \|\psi\| = 1\}$.  (B.300)

In particular, if $a, b \in B(H)$ are both positive and $a \leq b$, then $\|a\| \leq \|b\|$.

Proof. Define the numerical range $\nu(a)$ of an arbitrary $a \in B(H)$ as

$\nu(a) = \{\langle \psi, a\psi \rangle, \psi \in H, \|\psi\| = 1\}$.  (B.301)

Clearly, if $\lambda \in \sigma_p(a)$, then $\lambda \in \nu(a)$. If $\lambda \in \sigma_c(a)$, then, in the notation of Theorem B.91, by Cauchy-Schwarz and normalization of $\psi_n$ we have

$|\langle \psi_n, (a - \lambda)\psi_n \rangle| \leq \|(a - \lambda)\psi_n\|$.  (B.302)

Hence in view of (B.279) we have

$\lim_{n \to \infty} \langle \psi_n, a\psi_n \rangle = \lambda$.  (B.303)

So $\lambda \in \nu(a)^-$, whence $\sigma(a) \subset \nu(a)^-$, and hence $r(a) \leq \sup\{|\lambda|, \lambda \in \nu(a)\}$. From Cauchy-Schwarz, in (B.301) we have $|\langle \psi, a\psi \rangle| \leq \|a\|$. If also $a^* = a$, then by Lemma B.95,

$\|a\| = r(a) \leq \sup\{|\lambda|, \lambda \in \nu(a)\} \leq \|a\|$.

Hence we have equalities everywhere, and (B.300) follows. □
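The identity (B.300) can be checked numerically in the simplest case. The sketch below assumes a real symmetric $2 \times 2$ matrix (an illustrative choice, not an example from the text), for which real unit vectors $(\cos t, \sin t)$ already attain the extremes of the numerical range and the norm is available in closed form from the eigenvalues.

```python
import math

def quadratic_form(a, psi):
    """Return <psi, a psi> for a real 2x2 matrix a and a real vector psi."""
    (a11, a12), (a21, a22) = a
    x, y = psi
    return x * (a11 * x + a12 * y) + y * (a21 * x + a22 * y)

def norm_symmetric_2x2(a):
    """Operator norm of a real symmetric 2x2 matrix: the largest |eigenvalue|."""
    (a11, a12), (_, a22) = a
    mean = (a11 + a22) / 2.0
    radius = math.hypot((a11 - a22) / 2.0, a12)
    return max(abs(mean + radius), abs(mean - radius))

# Assumed example matrix; its eigenvalues are (1 +/- sqrt(13)) / 2.
a = ((2.0, 1.0), (1.0, -1.0))

# Approximate sup |<psi, a psi>| over unit vectors psi = (cos t, sin t).
samples = 20000
sup_form = max(
    abs(quadratic_form(a, (math.cos(t), math.sin(t))))
    for t in (math.pi * k / samples for k in range(samples))
)
norm_a = norm_symmetric_2x2(a)
```

The sampled supremum never exceeds the norm and approaches it as the sampling is refined, in line with the chain of inequalities closing the proof.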

Generalizing parts 2 and 3 of Theorem A.15 to the infinite-dimensional case requires some motivation. To this effect, note that the continuous functional calculus $f \mapsto f(a)$ is positive, i.e., if $f \geq 0$ pointwise, then $f(a) \geq 0$ in that $\langle \psi, f(a)\psi \rangle \geq 0$ for each $\psi \in H$. Indeed, we have $f \geq 0$ iff $f = g^*g$ for some $g \in C(\sigma(a))$, with $g^*(x) = \overline{g(x)}$ as usual, and hence, by (B.290) - (B.291), $f(a) = g(a)^*g(a)$ and therefore $\langle \psi, f(a)\psi \rangle = \|g(a)\psi\|^2 \geq 0$. By Corollary B.17, if $\psi \in H$ is a unit vector, there is a probability measure $\mu_\psi$ on $\sigma(a)$ such that for each $f \in C(\sigma(a))$,

$\langle \psi, f(a)\psi \rangle = \int_{\sigma(a)} d\mu_\psi \, f$.  (B.304)

The key to the envisaged generalization of Theorem A.15 is that the integral on the right may actually be defined for a far larger class of functions than $C(\sigma(a))$; cf. (B.29). This suggests that the expression $f(a)$ on the left-hand side should similarly be generalized to a larger class of functions $f$. However, the spaces considered in §B.6 are defined on the basis of some measure $\mu$; since $\mu_\psi$ in (B.304) varies with $\psi$ and $f(a)$ should be independent of $\psi$, it is appropriate to use the space $\mathcal{B}(\sigma(a))$ of bounded functions $f : \sigma(a) \to \mathbb{C}$ that are measurable with respect to the Borel $\sigma$-algebra on $\sigma(a)$ (which consists of the Borel sets on $\mathbb{R}$ intersected with $\sigma(a)$).
Since both boundedness and measurability are preserved under uniform limits (measurability even being preserved under pointwise limits), $\mathcal{B}(\sigma(a))$ is complete in the sup-norm, which makes it a commutative C*-algebra (under pointwise operations). Among all functions in $\mathcal{B}(\sigma(a))$, we will be particularly interested in the characteristic functions $1_A$, where $A \subset \sigma(a)$ is measurable. The expressions

$e_A = 1_A(a)$,  $(A \subset \sigma(a))$;  (B.305)

$e_A = e_{A \cap \sigma(a)}$,  $(A \subset \mathbb{R})$;  (B.306)

$e_\lambda = 1_{\{\lambda\}}(a)$,  $(\lambda \in \sigma_p(a))$,  (B.307)

to be defined below, where $A$ is a Borel set (and $e_\emptyset = 0$ by convention), are the spectral projections of $a$ (which are of fundamental importance to quantum mechanics).

Lemma B.97. Any positive function $f \in \mathcal{B}(\sigma(a))$ is a pointwise limit of some monotone increasing bounded sequence $(f_n)$ in $C(\sigma(a))$, written $f_n \nearrow f$. That is,

$0 \leq f_1(x) \leq \cdots \leq f_n(x) \leq f_{n+1}(x) \leq \cdots \leq c$;  (B.308)

$f(x) = \lim_{n \to \infty} f_n(x), \quad x \in \sigma(a)$.  (B.309)

Proof. We start with $f = 1_K$, where $K \subset \sigma(a)$ is compact. Then $K = \bigcap_n U_n$ for certain open sets $U_n$ (this is true for any second countable space), and taking "Urysohn" functions $f_n$ for each $U_n$ (i.e., $f_n \in C_c(U_n)$, $0 \leq f_n(x) \leq 1$ for $x \in \sigma(a)$, and $f_n(x) = 1$ for $x \in K$), we obviously have $f_n \to 1_K$. Next, if $U \subset \sigma(a)$ is open, we have $U = \bigcup_n K_n$ for suitable compact $K_n$ (since $\mathbb{R}$ and hence $\sigma(a)$ is $\sigma$-compact), so $1_{K_n} \to 1_U$. This also gives $1_C$ for closed sets $C = \sigma(a) \setminus U$, since $1_C = 1_{\sigma(a)} - 1_U$.

Using the so-called Borel hierarchy, it can be shown that any Borel set $A \subset \sigma(a)$ can be constructed from open and closed sets in at most a countable number of steps, at each of which a countable union or intersection of sets from the previous steps is used. This gives $1_A$ for any Borel set, and hence also yields the simple functions $s = \sum_k c_k 1_{A_k}$ with $c_k \geq 0$. For arbitrary measurable $f \geq 0$ (not necessarily bounded and not even necessarily finite) it is a standard result in measure theory that there is a sequence $(s_n)$ of simple functions such that $s_n \nearrow f$: to wit, define

$A_{n,k} = \{x \in \sigma(a) \mid 2^{-n}k \leq f(x) < 2^{-n}(k+1)\}$;  (B.310)

$A_n = \{x \in \sigma(a) \mid n \leq f(x) \leq \infty\}$;  (B.311)

$s_n = n 1_{A_n} + 2^{-n} \sum_{k=1}^{n2^n - 1} k 1_{A_{n,k}}$.  (B.312)

Relabeling the (at most) countable number of sequences thus obtained as a single sequence then gives a positive sequence $(h_n)$ in $C(\sigma(a))$ such that $h_n \to f$ pointwise.

A final trick turns $(h_n)$ into a monotone increasing bounded sequence $(f_n)$: for $m > n$, define $f_{n,m} = \min\{h_n, \ldots, h_m\}$, which is monotone decreasing in $m$ and positive, and hence has a (pointwise) limit $f_n = \lim_{m \to \infty} f_{n,m}$. The ensuing sequence $(f_n)$ is monotone increasing and still converges to $f$. If $f$ is bounded (as we assume by definition of $\mathcal{B}(\sigma(a))$), then $(f_n)$ must also be bounded eventually. □
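The dyadic simple functions used in the preceding proof are easy to evaluate pointwise. The following sketch assumes a bounded test function on $[0,1]$ (a hypothetical stand-in for a function on $\sigma(a)$) and checks on a grid that the steps increase with $n$, stay below $f$, and come within $2^{-n}$ of $f$ wherever $f < n$.

```python
import math

def dyadic_step(f, n):
    """n-th dyadic simple function: value n where f >= n, and otherwise
    f rounded down to the nearest multiple of 2**-n."""
    def s(x):
        y = f(x)
        if y >= n:
            return float(n)
        return math.floor(y * 2 ** n) / 2 ** n
    return s

# Assumed test function (positive and bounded on [0, 1]).
f = lambda x: x * x + 0.3
grid = [k / 200 for k in range(201)]

monotone = all(
    dyadic_step(f, n)(x) <= dyadic_step(f, n + 1)(x) <= f(x)
    for n in range(1, 10)
    for x in grid
)
close = all(
    f(x) - dyadic_step(f, n)(x) <= 2.0 ** -n
    for n in range(2, 10)      # f < 2 on the grid, so f < n for all n >= 2
    for x in grid
)
```

Both flags come out true for this test function; the convergence here is even uniform because the function is bounded, whereas for unbounded $f$ it is only pointwise.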
If $f \in \mathcal{B}(\sigma(a))$ and $f_n \nearrow f$ with $f_n \in C(\sigma(a))$, we would like to define $f(a)$ as $\lim_n f_n(a)$, just as in the case where $f \in C(\sigma(a))$ and $f_n \in P(\sigma(a))$. However, in the former case convergence $f_n \to f$ is merely pointwise, whereas in the latter case it was uniform, translated into norm convergence $f_n(a) \to f(a)$. Pointwise convergence of functions, then, becomes strong convergence of operators:

Proposition B.98. If $(a_n)$ is a sequence of positive operators on $H$ for which

$0 \leq a_1 \leq \cdots \leq a_n \leq a_{n+1} \leq \cdots \leq c \cdot 1_H$,  (B.313)

where $a_i \leq a_j$ means that $\langle \psi, a_i\psi \rangle \leq \langle \psi, a_j\psi \rangle$ for each $\psi \in H$, then there exists a unique positive operator $a$ such that $a_n \nearrow a$ strongly, i.e., for each $\psi \in H$,

$a\psi = \lim_{n \to \infty} a_n\psi$.  (B.314)

Furthermore, $a = \sup_n a_n$ with respect to the partial ordering $\leq$ on the set of positive bounded operators (that is, $a_n \leq a$ for each $n$, and if $a_n \leq b$ for each $n$, then $a \leq b$).

Proof. Recalling Proposition A.4, define a sequence of bounded quadratic forms $Q_n : H \to \mathbb{R}$ by $Q_n(\psi) = \langle \psi, a_n\psi \rangle$. Then $(Q_n(\psi))$ is a monotone increasing bounded sequence for each $\psi \in H$, so that $Q(\psi) = \lim_{n \to \infty} Q_n(\psi)$ exists. Like each $Q_n$, also $Q$ satisfies (A.8) - (A.9). Since $|Q_n(\psi)| \leq c\|\psi\|^2$ and hence $|Q(\psi)| \leq c\|\psi\|^2$, it remains bounded. Hence (A.10) defines a bounded hermitian form $B$, upon which Proposition B.79 yields a bounded operator $a$, satisfying $B(\varphi, \psi) = \langle \varphi, a\psi \rangle$. Since

$\langle \psi, a\psi \rangle = \lim_n \langle \psi, a_n\psi \rangle$,  (B.315)

we have $a \geq 0$. To prove (B.314), note that (B.315) gives $\langle \psi, (a - a_n)\psi \rangle \to 0$, but (B.313) implies $a - a_n \geq 0$, so that $a - a_n$ has a self-adjoint square root $\sqrt{a - a_n}$, defined by Theorem B.94 (see also Proposition B.99 below). Hence

$\langle \psi, (a - a_n)\psi \rangle = \langle \sqrt{a - a_n}\,\psi, \sqrt{a - a_n}\,\psi \rangle = \|\sqrt{a - a_n}\,\psi\|^2 \to 0$.  (B.316)

Now if a sequence of operators $(b_n)$ is such that $\|b_n\| \leq C$ for all $n$, and $\|b_n\psi\| \to 0$, then also $\|b_n^2\psi\| \to 0$, for $\|b_n^2\psi\| \leq \|b_n\| \|b_n\psi\| \leq C\|b_n\psi\| \to 0$. This applies here, since $a_m \leq a_n$ for $m \leq n$, and hence $a - a_n \leq a - a_m$, from which $\|a - a_n\| \leq \|a - a_m\|$ (see Lemma B.96). Fixing $m$, this gives $\|a - a_n\| \leq C$ with $C = \|a - a_m\|$, for all $n \geq m$. So (B.316) implies $\|(a - a_n)\psi\| \to 0$, which is (B.314).

As to the final claim, eq. (B.315) is the same as $\langle \psi, a\psi \rangle = \sup_n \{\langle \psi, a_n\psi \rangle\}$. □
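A standard example (not taken from the text, and stated here only as an illustrative sketch) shows why the convergence in Proposition B.98 is strong rather than in norm. Assume $H = \ell^2(\mathbb{N})$ and let $e_n$ be the projection onto the first $n$ coordinates:

```latex
% Assumption: H = \ell^2(\mathbb{N}); e_n projects onto the first n coordinates.
(e_n\psi)(x) =
  \begin{cases} \psi(x), & x \le n, \\ 0, & x > n, \end{cases}
\qquad 0 \le e_1 \le e_2 \le \cdots \le 1_H .

% Strong convergence: the tail of a square-summable sequence vanishes.
\|(1_H - e_n)\psi\|^2 = \sum_{x > n} |\psi(x)|^2 \to 0 \qquad (n \to \infty).

% No norm convergence: 1_H - e_n is a nonzero projection for every n.
\|1_H - e_n\| = 1 \qquad \text{for all } n .
```

Thus $\sup_n e_n = 1_H$ in the sense of the proposition, although $(e_n)$ has no limit in the norm topology.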

In this proof, we used the following generalization of Proposition A.22: 

Proposition B.99. The following conditions on $a \in B(H)$ are equivalent:

1. $\langle \psi, a\psi \rangle \geq 0$ for arbitrary $\psi \in H$;

2. $a^* = a$ and $\sigma(a) \subset \mathbb{R}^+$;

3. $a = c^2$ for some bounded self-adjoint operator $c \in B(H)$;

4. $a = b^*b$ for some bounded operator $b \in B(H)$.
Proof. The proof is the same as in the finite-dimensional case, except that:

• In 1 ⇒ 2 we use (B.303) to exclude the possibility that some $\lambda < 0$ lies in $\sigma(a)$;

• In 2 ⇒ 3 we need Theorem B.94 to define the square root $c = \sqrt{a}$ from the function $\sqrt{\cdot} : \sigma(a) \to \mathbb{R}$ (which is well defined because $\sigma(a) \subset \mathbb{R}^+$). By (B.290) with $g = f = \sqrt{\cdot}$, we then have $\sqrt{a}\sqrt{a} = a$. □

Given some positive $f \in \mathcal{B}(\sigma(a))$, we now use Lemma B.97 to find a monotone increasing bounded sequence $(f_n)$ in $C(\sigma(a))$ such that $f_n \nearrow f$ pointwise, and subsequently use Proposition B.98 to define $f(a)$ as the strong limit

$f(a)\psi = \lim_{n \to \infty} f_n(a)\psi \quad (\psi \in H)$.  (B.317)

Arbitrary functions $f$ are then dealt with using (B.30) and performing the above construction term-wise. This, then, yields $f(a)$ for any $a^* = a \in B(H)$ and $f \in \mathcal{B}(\sigma(a))$.

It is natural to ask which corner of $B(H)$ the operators $f(a)$ land in when $f \in \mathcal{B}(\sigma(a))$, much as we have shown that $f(a) \in C^*(a)$ for $f \in C(\sigma(a))$. A safe choice would be $C^*(a)^-$, i.e., the strong closure of $C^*(a)$, which by definition contains all limits of all strongly convergent nets in $C^*(a)$ (so that it certainly contains all limits (B.317)), and which is automatically a strongly closed unital *-algebra. This may seem too large, but if $H$ is separable, it turns out to be the right choice, because these more general limits add nothing to (B.317). For a more explicit description of $C^*(a)^-$ we need the commutant $S'$ of any $S \subset B(H)$, which is defined by

$S' = \{a \in B(H) \mid ab = ba \ \forall b \in S\}$;  (B.318)

the bicommutant of $S$ is $S'' = (S')'$. If $S^* = S$, in that $a \in S$ iff $a^* \in S$, then $S'$ is easily seen to be a unital *-algebra within $B(H)$. Furthermore, it is obvious that $S \subset S''$, so that the passage $S \mapsto S''$ is some sort of a closure operation within $B(H)$, comparable to the closure operation $S \mapsto S^{\perp\perp}$ within $H$ itself. Indeed, there is a striking analogue of (B.204) at the operator level, due to von Neumann (see Theorem C.127):

Theorem B.100. If $A$ is a unital *-algebra in $B(H)$, then

$A'' = A^-$,  (B.319)

where $A^-$ is the strong closure of $A$ in $B(H)$ (which is automatically a *-algebra).

Corollary B.101. Denoting the strong closure $C^*(a)^-$ of $C^*(a)$ by $W^*(a)$, we have

$W^*(a) = C^*(a)''$.  (B.320)

Though not obvious from (B.320), the alternative description through (B.319) shows that $W^*(a)$ inherits the commutativity of $C^*(a)$; in fact $W^*(a)$ is a commutative C*-algebra, too. Moreover, by construction it is also a von Neumann algebra in that $W^*(a)'' = W^*(a)$, cf. Appendix C. Such unital *-algebras in $B(H)$ are not merely norm closed, but are also closed in at least three other natural topologies on $B(H)$, including the strong one. The situation may be summarized in the spectral theorem:
Theorem B.102. Let $a^* = a \in B(H)$. The isomorphism $C(\sigma(a)) \to C^*(a)$ of Theorem B.94 has a unique extension to a homomorphism

$\mathcal{B}(\sigma(a)) \to W^*(a), \quad f \mapsto f(a)$,  (B.321)

for which (B.289) - (B.291) continue to hold. In particular, the operator $e_A$ in (B.305) is a projection. Also, eq. (B.304) remains valid, and for each $f \in \mathcal{B}(\sigma(a))$, one has

$\|f(a)\| \leq \|f\|_\infty$.  (B.322)

Proof. The map $f \mapsto f(a)$ is given by (B.317) and preceding discussion. Eqs. (B.289) and (B.291) easily follow by limiting arguments. Using the same trick as in the proof of Proposition B.98 it can be shown that $f(a)^2 = f^2(a)$, whence, using the identity $fg = \frac{1}{2}((f + g)^2 - f^2 - g^2)$, (B.290) follows. This implies $e_A^2 = 1_A^2(a) = 1_A(a) = e_A$, whilst (B.291) gives $e_A^* = \overline{1_A}(a) = 1_A(a) = e_A$.

We prove (B.322) for $f \geq 0$; this implies the general case by (B.30) and the triangle inequality. Writing $H_1$ for the set of unit vectors in $H$, approximating $f_n \nearrow f$, repeatedly using (B.300), the property $f(a) = \sup_n f_n(a)$ established at the end of Proposition B.98, and finally using (B.292) for each $f_n \in C(\sigma(a))$, we may estimate:

$\|f(a)\| = \sup_{\psi \in H_1} \{|\langle \psi, f(a)\psi \rangle|\}$
$= \sup_{\psi \in H_1} \sup_{n \in \mathbb{N}} \{|\langle \psi, f_n(a)\psi \rangle|\}$
$= \sup_{n \in \mathbb{N}} \sup_{\psi \in H_1} \{|\langle \psi, f_n(a)\psi \rangle|\}$
$= \sup_{n \in \mathbb{N}} \|f_n(a)\| = \sup_{n \in \mathbb{N}} \|f_n\|_\infty$
$\leq \|f\|_\infty$,  (B.323)

where the last inequality is a trivial consequence of the specific limit $f_n \nearrow f$.

Finally, our motivating identity (B.304) follows from the same equality for each $f_n \in C(\sigma(a))$, upon which Lebesgue's Monotone Convergence Theorem yields the right-hand side, whereas (B.315) gives the left-hand side. □

Of course, in finite dimension, Theorem B.102 coincides with Theorems A.15 and B.94. Theorem A.15 implies Theorem A.10 through (A.58) - (A.59), and, as we will now explain, in infinite dimension Theorem B.102 similarly implies a certain approximate version of Theorem A.10, namely Corollary B.104.

Lemma B.103. If $K \subset \mathbb{R}$ is compact, any $f \in C(K)$ may be uniformly approximated by simple functions. More precisely, for each $\varepsilon > 0$ there is a decomposition $K = \bigsqcup_{i=1}^n A_i$ of $K$ as a disjoint union of $n < \infty$ Borel sets $A_i$, such that for any $x_i \in A_i$,

$\left\| f - \sum_{i=1}^n f(x_i) 1_{A_i} \right\|_\infty < \varepsilon$.  (B.324)

Proof. Since $K$ is compact, $f$ is actually uniformly continuous on $K$. This means that for $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ whenever $|x - y| < \delta$. Since (B.324) just states that $|f(x) - f(x_i)| < \varepsilon$ for each $i = 1, \ldots, n$ and each $x \in A_i$, any partition for which $0 < |A_i| < \delta$ will do (where $|A| = \sup\{|x - y|, x, y \in A\}$). □
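The construction in this proof can be tried out directly. The sketch below assumes $K = [0,1]$, the function $f(x) = x^2$ (whose Lipschitz constant 2 gives $\delta = \varepsilon/2$), cells $A_i = [i/n, (i+1)/n)$ and left endpoints as tags $x_i$; all of these are illustrative choices rather than part of the lemma.

```python
import math

def simple_approximation(f, n):
    """Step function sum_i f(x_i) 1_{A_i} on [0, 1], with cells
    A_i = [i/n, (i+1)/n) and tags x_i = i/n (left endpoints)."""
    def s(x):
        i = min(int(x * n), n - 1)   # the last cell also contains x = 1
        return f(i / n)
    return s

def sup_error(f, s, samples=10001):
    """Largest deviation |f - s| on a uniform grid of [0, 1]."""
    return max(
        abs(f(k / (samples - 1)) - s(k / (samples - 1)))
        for k in range(samples)
    )

f = lambda x: x * x               # Lipschitz constant 2 on [0, 1]
eps = 0.01
n = math.floor(2.0 / eps) + 1     # cell width 1/n is at most eps / 2
error = sup_error(f, simple_approximation(f, n))
```

The measured deviation stays below `eps`, as (B.324) predicts; it is only sampled on a grid, so this is a sanity check rather than a proof.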

From (B.305), Lemma B.103, and Theorem B.102, we then immediately have: 

Corollary B.104. Let $a^* = a \in B(H)$. For any $f \in C(\sigma(a))$ and any $\varepsilon > 0$, there is a partition $\sigma(a) = \bigsqcup_{i=1}^n A_i$ of $\sigma(a)$ as a disjoint union of $n < \infty$ Borel sets $A_i$, such that for arbitrary $x_i \in A_i$, one has

$\left\| f(a) - \sum_{i=1}^n f(x_i) e_{A_i} \right\| < \varepsilon$.  (B.325)

In particular, for $f(x) = x$ and $f(x) = 1$ we have

$\left\| a - \sum_{i=1}^n x_i e_{A_i} \right\| < \varepsilon$;  (B.326)

$\left\| 1_H - \sum_{i=1}^n e_{A_i} \right\| < \varepsilon$.  (B.327)

If $a$ has discrete spectrum ($\sigma(a) = \sigma_p(a)$), then (B.326) - (B.327) reduce to (A.37) - (A.38), where $e_\lambda$ is defined by (B.307), and the sums converge in norm.

Hence in this version of the spectral theorem, one approximates $a$ by linear combinations of projections in a way that reflects the approximation of the identity function $x \mapsto x$ on $\sigma(a)$ by simple functions. Eq. (B.326) is often symbolically written as

$a = \int_{\sigma(a)} de_\lambda \, \lambda$,  (B.328)

which may also be given some direct meaning as an operator-valued Stieltjes integral, but even so, this neat expression eventually boils down to (B.326) itself.

Corollary B.105. Let $\mathcal{P}(A) = \{e \in A \mid e^2 = e^* = e\}$, where $A$ is a von Neumann algebra. Then $A$ is the norm-closure of the linear span of $\mathcal{P}(A)$, and

$A = \mathcal{P}(A)''$.  (B.329)

Proof. The first claim follows from Corollary B.104. This implies (B.329), which may also be proved directly: since $\mathcal{P}(A) \subset A$, the inclusion $\mathcal{P}(A)'' \subset A'' = A$ is obvious. Conversely, let $a \in A$ and assume $a^* = a$ (if not, decompose $a = a' + ia''$ with $a'$ and $a''$ self-adjoint). Then $W^*(a) \subset A$, so that $A$ contains all spectral projections of $a$, cf. Theorem B.102. Moreover, by Corollary B.104, $a$ lies in the norm-closure of the linear span of $\mathcal{P}(A)$, which by Theorem B.100 in turn is contained in $\mathcal{P}(A)''$. □
Compared with Theorem B.94, it seems a weakness of Theorem B.102 that the map $f \mapsto f(a)$ fails to be an isomorphism from $\mathcal{B}(\sigma(a))$ to $W^*(a)$. The reason is that although the map is surjective (at least when $H$ is separable), it fails to be injective; for real-valued $f$ one has $f(a) = 0$ iff $\langle \psi, f(a)\psi \rangle = 0$ for all $\psi \in H$, which by (B.304) is the case iff $\int_{\sigma(a)} d\mu_\psi \, f = 0$ for all unit vectors $\psi \in H$, which in turn is the case iff $f = 0$ a.e. with respect to every $\mu_\psi$; in other words, iff $f = 0$ in $L^\infty(\sigma(a), \mu_\psi)$ for every $\psi$.

Thus the right kind of algebra to be isomorphic to $W^*(a)$ is $L^\infty(\sigma(a), \mu)$ rather than $\mathcal{B}(\sigma(a))$, where $\mu$ is some (probability) measure on $\sigma(a)$ such that $\mu(A) = 0$ iff $\mu_\psi(A) = 0$ for all unit vectors $\psi \in H$. Indeed, in that case, since by construction

$L^\infty(\sigma(a), \mu) \cong \mathcal{B}(\sigma(a)) / \{f \mid f = 0 \ \mu\text{-a.e.}\} = \mathcal{B}(\sigma(a)) / \ker(f \mapsto f(a))$,  (B.330)

our map $\mathcal{B}(\sigma(a)) \to W^*(a)$ descends to an isomorphism of von Neumann algebras:

$L^\infty(\sigma(a), \mu) \xrightarrow{\ \cong\ } W^*(a)$.  (B.331)

This is quite nontrivial; let us first present a case study where everything is clear.

Proposition B.106. Let $H = L^2(0,1) \equiv L^2([0,1])$ (with Lebesgue measure), and let $a = m_{\mathrm{id}} \in B(H)$ (where $\mathrm{id}(x) = x$) be the self-adjoint position operator

$a\psi(x) = x\psi(x)$.  (B.332)

Then the map $f \mapsto f(a)$ in both Theorems B.94 and B.102 is given by

$f(a) = m_f$,  (B.333)

cf. Proposition B.73. The two *-algebras in $B(H)$ defined by $a$ are given by

$C^*(a) = C([0,1])$;  (B.334)

$W^*(a) = L^\infty(0,1)$,  (B.335)

both realized as multiplication operators (i.e., identifying $f$ with $m_f$). Furthermore,

$L^\infty(0,1)' = L^\infty(0,1)$.  (B.336)

More generally, let $K \subset \mathbb{R}$ be compact, let $\mu$ be a regular probability measure on $K$ with support $K$, take $H = L^2(K, \mu)$ and define $a$ as in (B.332). Then:

$\sigma(a) = K$;  (B.337)

$C^*(a) = C(K)$;  (B.338)

$W^*(a) = L^\infty(K, \mu)$;  (B.339)

$f(a) = m_f$;  (B.340)

$L^\infty(K, \mu)' = L^\infty(K, \mu)$.  (B.341)
Proof. We just prove the case $K = [0,1]$ with $d\mu(x) = dx$; the general case is similar.

Eq. (B.333) is obvious for polynomials $f$, and otherwise follows from easy limiting arguments. Consequently, eq. (B.334) is an instance of Theorem B.94. Everything else then follows if we can prove that

$C([0,1])' = L^\infty(0,1)$.  (B.342)

Namely, assuming (B.342), since $C([0,1]) \subset L^\infty(0,1)$ (and $A \subset B$ implies $B' \subset A'$), we automatically have $L^\infty(0,1)' \subset C([0,1])'$, so (B.342) implies $L^\infty(0,1)' \subset L^\infty(0,1)$, and since the converse inclusion is trivial from commutativity of $L^\infty(0,1)$, eq. (B.342) implies (B.336). Furthermore, since $W^*(a) = C([0,1])''$, taking the commutant of (B.342) and applying (B.336) yields (B.335).

So let us prove (B.342). The inclusion $L^\infty(0,1) \subset C([0,1])'$ is obvious, since $m_f m_g = m_{fg} = m_{gf} = m_g m_f$, so we need to prove the converse. Take $b \in C([0,1])'$ and define $f = b 1_{[0,1]} \in L^2(0,1)$. For $\psi \in C([0,1]) \subset L^2(0,1)$, we have

$b\psi = b m_\psi 1_{[0,1]} = m_\psi b 1_{[0,1]} = m_\psi f = m_\psi m_f 1_{[0,1]} = m_f m_\psi 1_{[0,1]} = m_f \psi$,  (B.343)

so $b = m_f$ on the dense domain $C([0,1]) \subset L^2(0,1)$, with $f \in L^2(0,1)$. Now $b$ is bounded by definition of the commutant $C([0,1])'$ and hence $\|m_f\| < \infty$. If $f \notin L^\infty(0,1)$, the proof of Proposition B.73 gives that $X_t$ has positive measure for each $t > 0$, whence $\|m_f\| \geq t$ for all $t$, which is a contradiction. Hence $f \in L^\infty(0,1)$, in which case $m_f$ extends to all of $L^2(0,1)$ by continuity. This extension must equal $b$, so that $b = m_f$, and hence $C([0,1])' \subset L^\infty(0,1)$. □

The following variation on this example turns out to be qualitatively different: 

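Proposition B.107. Realizing $\ell^\infty(\mathbb{N})$ as multiplication operators on $\ell^2(\mathbb{N})$, one has

$\ell^\infty(\mathbb{N})' = \ell^\infty(\mathbb{N})$.  (B.344)

Proof. For each $N \in \mathbb{N}$, we define a finite-dimensional subspace $\ell^2(N) \subset \ell^2(\mathbb{N})$ by

$\ell^2(N) = \{\psi \in \ell^2(\mathbb{N}) \mid \psi(x) = 0 \ \forall x > N\}$,

with ensuing projection $1_N : \ell^2(\mathbb{N}) \to \ell^2(N)$, i.e., $1_N\psi(x) = \psi(x)$ for $x \leq N$ and $1_N\psi(x) = 0$ for $x > N$. If $b \in \ell^\infty(\mathbb{N})'$ we have $b : \ell^2(N) \to \ell^2(N)$, because $1_N \in \ell^\infty(\mathbb{N})$ (and hence $\psi \in \ell^2(N)$, i.e., $1_N\psi = \psi$, implies $b\psi \in \ell^2(N)$, i.e., $1_N b\psi = b\psi$). With $f_N : \mathbb{N} \to \mathbb{C}$ given by $f_N = b 1_N$, define $f : \mathbb{N} \to \mathbb{C}$ by $f(x) = f_N(x)$ for any $N \geq x$; this is well defined, in that if $x \leq N \leq M$, then $f_N(x) = f_M(x)$. For any $N$ and $\psi \in \ell^2(N)$, as in (B.343) we have $b\psi = m_f\psi$, which therefore holds on a dense subspace $\bigcup_N \ell^2(N)$ of $\ell^2(\mathbb{N})$. Again as in the previous proof, this gives

$\|f\|_\infty = \|m_f\| = \|b\| < \infty$,  (B.345)

i.e., $f \in \ell^\infty(\mathbb{N})$. Thus $b = m_f \equiv f \in \ell^\infty(\mathbb{N})$, whence $\ell^\infty(\mathbb{N})' \subset \ell^\infty(\mathbb{N})$. With the trivial opposite inclusion, this gives (B.344). □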





B.16 Abelian *-algebras in B(H)




Note that since a possible (discrete) position operator (B.332) would be unbounded on $\ell^2(\mathbb{N})$, a possible counterpart to (B.335), although it exists, would blast the framework of this section (cf. §B.21). See, however, the proof of Theorem B.118. More generally, we have:

Proposition B.108. Let $(X, \Sigma, \mu)$ be a $\sigma$-finite Borel space and realize $L^\infty(X, \mu)$ as multiplication operators on $L^2(X, \mu)$. Then

$L^\infty(X, \mu)' = L^\infty(X, \mu)$.  (B.346)

Proof. Writing $X = \bigcup_n X_n$ with $\mu(X_n) < \infty$, which holds by virtue of $\sigma$-finiteness, the proof is practically the same as for $X = \mathbb{N}$ (except for the fact that $L^2(X_n) \subset L^2(X)$ need not be finite-dimensional, but it is closed, which suffices). □

If $A \subset B(H)$ is a commutative *-algebra, we say that $A$ is maximal (abelian) if $A \subset B \subset B(H)$ for some commutative *-algebra $B$ implies $B = A$. Any *-algebra $A \subset B(H)$ is abelian iff $A \subset A'$ (this is trivial), and is maximally abelian iff $A' = A$. To see the nontrivial direction (that $A' = A$ implies maximality), note that for any subsets $C \subset B(H)$ and $D \subset B(H)$ the inclusion $C \subset D$ implies $D' \subset C'$ (as is immediate from the definition of the commutant), so $A \subset B$ gives $B' \subset A'$. Since $B$ is commutative, we also know that $B \subset B'$, whence $B \subset A'$. If $A' = A$ this gives $B \subset A$, so $B = A$. The condition $A' = A$, in turn, implies $A'' = A$, i.e., any maximal abelian *-algebra $A$ in $B(H)$ is automatically a von Neumann algebra.

Corollary B.109. In the setting of Proposition B.108, $L^\infty(X, \mu)$ is a maximal abelian *-algebra in $B(L^2(X, \mu))$, and hence a von Neumann algebra. In particular:

• $L^\infty(0,1)$ is a maximal abelian *-algebra in $B(L^2(0,1))$;

• $\ell^\infty(\mathbb{N})$ is a maximal abelian *-algebra in $B(\ell^2(\mathbb{N}))$.

The above examples suggest a neat reformulation of the spectral theorem. This 
requires a few more concepts from the theory of operator algebras, cf. Appendix C. 

Definition B.110. For any *-algebra $A \subset B(H)$ and $\psi \in H$, we write $A\psi^- \subset H$ for the closure of the linear subspace of all vectors $a\psi$, $a \in A$. We say that $\psi$ ($\neq 0$) is:

• cyclic for $A$ if $A\psi^- = H$;

• separating for $A$ if $a\psi = 0$ for $a \in A$ implies $a = 0$.

If $a^* = a \in B(H)$, we similarly say that $\psi$ is cyclic (separating) for $a$ if $\psi$ is cyclic (separating) for $A = C^*(a)$, or, equivalently, for $A = W^*(a)$.

The equivalence of the two ways of writing the last definition follows from the relation $W^*(a)\psi^- = C^*(a)\psi^-$, cf. Corollary B.101; more generally, $\psi$ is cyclic (separating) for $A$ iff it is cyclic (separating) for its strong closure $A^-$.

For example, if $A = B(H)$, any vector is cyclic for $A$, and none is separating. On the other hand, if $A = \mathbb{C} \cdot 1_H$, then no vector is cyclic for $A$ and all vectors are separating. If $H = L^2(X, \mu)$ on some finite measure space, then $\psi = 1_X$ is cyclic as well as separating for $A = L^\infty(X, \mu)$. Noting (B.346), as well as the property $B(H)' = \mathbb{C} \cdot 1_H$, these examples illustrate a general phenomenon:
Lemma B.111. If $1_H \in A$, a vector $\psi$ is cyclic for $A$ iff it is separating for $A'$, and vice versa. In particular, if $A' = A$, then $\psi$ is cyclic for $A$ iff it is separating for $A$.

If $A$ is abelian, then every vector that is cyclic for $A$ is also separating for $A$.

Proof. If $A\psi^- = H$ and $b\psi = 0$ for $b \in A'$, then $ba\psi = ab\psi = 0$ for each $a \in A$ and hence $b$ vanishes on a dense subspace of $H$. Since $b$ is bounded, $b = 0$. Conversely, let $e$ be the projection onto $A\psi^-$; then $e \in A'$ and hence $1_H - e \in A'$. Since $1_H \in A$ we have $\psi \in A\psi^-$ and hence $e\psi = \psi$, whence $(1_H - e)\psi = 0$. If $\psi$ is separating for $A'$, this implies $e = 1_H$ and hence $A\psi^- = H$. Finally, $A$ is abelian iff $A \subset A'$. □

Theorem B.112. Let $a^* = a \in B(H)$, and suppose some unit vector $\psi \in H$ is cyclic for $a$. Then $a$ is unitarily equivalent to the position operator (B.332) on $L^2(\sigma(a), \mu_\psi)$, where the probability measure $\mu_\psi$ on $\sigma(a)$ is given by (B.304). Furthermore, through the unitary operator $u : H \to L^2(\sigma(a), \mu_\psi)$ in question we have

$u f(a) u^{-1} = f$;  (B.347)

$u C^*(a) u^{-1} = C(\sigma(a))$;  (B.348)

$u W^*(a) u^{-1} = L^\infty(\sigma(a), \mu_\psi)$,  (B.349)

all of which being realized as multiplication operators on $L^2(\sigma(a), \mu_\psi)$.

Moreover, $L^\infty(\sigma(a), \mu_\psi)$ is maximally abelian, and hence satisfies

$L^\infty(\sigma(a), \mu_\psi) = L^\infty(\sigma(a), \mu_\psi)'$.  (B.350)

Proof. First, define $u$ on a dense subspace of $H$ by

$u : C^*(a)\psi \to L^2(\sigma(a), \mu_\psi)$;  (B.351)

$u f(a)\psi = f, \quad f \in C(\sigma(a))$.  (B.352)

It follows from (B.289) - (B.291) and (B.304) that $\|f(a)\psi\|_H = \|f\|_2$, which makes $u$ well defined (since $f(a)\psi = g(a)\psi$ implies $f = g$), as well as isometric. In particular, $u$ is bounded, and hence it can be extended from $C^*(a)\psi$ to $H$ by continuity. This extension is surjective, since $C(\sigma(a))$ is dense in $L^2(\sigma(a), \mu_\psi)$, and therefore $u : H \to L^2(\sigma(a), \mu_\psi)$ is unitary. Then (B.347) - (B.348) hold by construction; the special case $f = \mathrm{id}$ yields (B.332). As in Proposition B.106, we obtain $C(\sigma(a))' = L^\infty(\sigma(a), \mu_\psi)$, which implies (B.349) - (B.350). □

Note that this proposition implies that $H$ is separable. When does a self-adjoint (or normal) operator $a$ have a cyclic vector? To practice, we first look at $H = \mathbb{C}^n$.

Proposition B.113. Let $H = \mathbb{C}^n$ and let $a = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ be a diagonal matrix. Then the following properties are equivalent:

1. All $\lambda_i$ are distinct, i.e., $|\sigma(a)| = n$ (in words, $a$ is non-degenerate);

2. The operator $a$ has a cyclic vector;

3. $C^*(a)' = C^*(a)$;

4. $C^*(a)$ is a maximal abelian C*-subalgebra of $B(H)$.
Proof. We first show that all $\lambda_i$ are distinct iff

$C^*(a) = D_n(\mathbb{C})$,  (B.353)

i.e., the set of all diagonal matrices. To see this, first note that for any $f : \sigma(a) \to \mathbb{C}$ (and any such function is continuous, since $\sigma(a)$ is a finite subset of $\mathbb{C}$) we have

$f(\mathrm{diag}(\lambda_1, \ldots, \lambda_n)) = \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n))$;  (B.354)

this is true by computation for polynomials in $a$, and these exhaust all functions on $\sigma(a)$. It follows that $C^*(a) \subset D_n(\mathbb{C})$. We know from (A.49) that $C^*(a) \cong C(\sigma(a))$ whether or not $a$ is non-degenerate, and since $\dim(C(\sigma(a))) = |\sigma(a)|$ (i.e., the number of elements of $\sigma(a)$), we obtain

$\dim(C^*(a)) = |\sigma(a)|$.  (B.355)

So if $a$ is non-degenerate, noting that $\dim(D_n(\mathbb{C})) = n$ we must have (B.353). If, on the other hand, $a$ is degenerate, we have $|\sigma(a)| = m < n$, so that also $\dim(C(\sigma(a))) = m < n$ and $C^*(a) \subset D_n(\mathbb{C})$ is a strict inclusion. Furthermore, by direct computation or as a special case of Proposition B.108, we have

$D_n(\mathbb{C})' = D_n(\mathbb{C})$.  (B.356)

To prove 1 ⇒ 2, take the cyclic vector to be

$\psi = (1, \ldots, 1)/\sqrt{n}$;  (B.357)

indeed, any vector $(z_1, \ldots, z_n)$ is equal to $\sqrt{n} \cdot \mathrm{diag}(z_1, \ldots, z_n)\psi$, and we have $\mathrm{diag}(z_1, \ldots, z_n) \in D_n(\mathbb{C}) = C^*(a)$ by (B.353). For 2 ⇒ 1, if $H$ has a cyclic vector $\psi$ for $a$, then by definition $C^*(a)\psi = \mathbb{C}^n$, so that $\dim(C^*(a)\psi) = n$. But also

$\dim(C^*(a)\psi) \leq \dim(C^*(a))$,  (B.358)

whether or not $\psi$ is cyclic for $a$. If $\psi$ is cyclic this gives

$n \leq \dim(C^*(a)) \leq n$  (B.359)

by (B.355), so that $\dim(C^*(a)) = n$, whence $|\sigma(a)| = n$ by (B.355).

Given this, the implication 1 ⇒ 3 follows from (B.356), whilst 3 ⇒ 4 follows from Theorem A.21. Finally, we prove 4 ⇒ 1: we already know that $C^*(a) \subset D_n(\mathbb{C})$, and by (B.356) and the above argument it follows that $D_n(\mathbb{C})$ is maximal. So if $C^*(a)$ is maximal, then $C^*(a) = D_n(\mathbb{C})$, and we already know from the first stage of the proof that this is equivalent to $a$ being non-degenerate. □
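The equivalence of conditions 1 and 2 can be probed computationally. In finite dimension $C^*(a)\psi$ is spanned by $\psi, a\psi, \ldots, a^{n-1}\psi$ (higher powers reduce to these by the Cayley-Hamilton theorem), so $\psi$ is cyclic iff these $n$ vectors have rank $n$. The sketch below assumes integer eigenvalues and uses exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def krylov_dimension(eigenvalues, psi):
    """dim span{psi, a psi, ..., a^(n-1) psi} for a = diag(eigenvalues)."""
    vectors, v = [], list(psi)
    for _ in range(len(eigenvalues)):
        vectors.append(v)
        v = [lam * x for lam, x in zip(eigenvalues, v)]
    return rank(vectors)

ones = [1, 1, 1]                                    # (B.357) up to normalization
dim_distinct = krylov_dimension([1, 2, 3], ones)    # non-degenerate case
dim_repeated = krylov_dimension([1, 2, 2], ones)    # degenerate case
```

For distinct eigenvalues the vectors form a Vandermonde matrix of full rank, so the vector of ones is cyclic; with a repeated eigenvalue the dimension drops to $|\sigma(a)| = 2$ whichever $\psi$ is tried, matching (B.358).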

With slightly more effort, an analogous result holds for general Hilbert spaces. 

Proposition B.114. A self-adjoint operator $a$ on a separable Hilbert space $H$ has a cyclic vector iff $W^*(a)$ is maximal abelian (i.e., $W^*(a)' = W^*(a)$).

In other words, $a$ has a cyclic vector iff $C^*(a)' = C^*(a)''$, cf. (B.320). As we have just seen, if $\dim(H) < \infty$ this is the case iff $a$ is non-degenerate. Consistent with (B.349) (with $u = 1$) and (B.350), the position operator (B.332) acting on the Hilbert space $L^2(\sigma(a), \mu)$ is maximal in this sense, with $\psi = 1_{\sigma(a)}$ a cyclic unit vector.

Proof. If $\psi$ is cyclic for $a$, then (B.349) and (B.350) (along with the self-evident property $uA'u^{-1} = (uAu^{-1})'$) yield $W^*(a)' = W^*(a)$. Conversely, for any *-algebra $A \subset B(H)$, one can find unit vectors $(\psi_i)$ such that $H = \bigoplus_i H_i$ with $H_i = A\psi_i^-$: start with any $\psi_1$, then take any $\psi_2 \in (A\psi_1^-)^\perp$ (in case this is nonzero, otherwise one was already done), etc. To show that this procedure terminates, Zorn's Lemma must be invoked (take the collection of all sets $(H_i)$ of mutually orthogonal $A$-stable subspaces $H_i \subset H$ that contain a cyclic vector for $A$). Then $\psi = \sum_i c_i\psi_i$ (with nonzero coefficients $c_i$ chosen such that the sum converges) is clearly separating for $A$. If $A' = A$, then $\psi$ is also cyclic for $A$; cf. Lemma B.111. □

Thus we call a self-adjoint operator $a \in B(H)$ maximal if it has a cyclic vector.

Corollary B.115. A maximal self-adjoint operator $a \in B(H)$ is unitarily equivalent to the position operator (B.332) on $L^2(\sigma(a), \mu)$, where $\mu$ is an appropriate probability measure on the spectrum $\sigma(a) \subset \mathbb{R}$. Moreover, the map $\mathcal{B}(\sigma(a)) \to W^*(a)$ in (B.321) induces an isomorphism (B.331) of von Neumann algebras.

Proof. Take $\mu = \mu_\psi$, cf. (B.304), where $\psi$ is cyclic (or, equivalently, separating) for $a$. The map $f \mapsto f(a)$ from $\mathcal{B}(\sigma(a))$ to $W^*(a)$ described in Theorem B.102 can be propelled further by conjugation with the unitary $u$ of Theorem B.112, that is,

$f \mapsto f(a) \mapsto u f(a) u^{-1} = m_f$,  (B.360)

$\mathcal{B}(\sigma(a)) \to B(H) \to B(L^2(\sigma(a), \mu_\psi))$,  (B.361)

where the final equality in (B.360) follows from the computation

$u f(a) u^{-1} g = u f(a) g(a)\psi = u (f \cdot g)(a)\psi = fg = m_f g$,  (B.362)

where for simplicity $g \in C(\sigma(a)) \subset L^2(\sigma(a), \mu_\psi)$, the inclusion being dense. The claim then immediately follows from (B.349). □

If a is not maximal, we can still prove a weaker version of Theorem B.112, which
is sometimes seen as the ultimate version of the spectral theorem. To justify this
view, take H = C^n and let a ∈ M_n(C) be self-adjoint (or, more generally, normal).
By Theorem A.10, H has a basis (υ_i) of eigenvectors of a, with aυ_i = λ_i υ_i. This
yields a unitary map u : H → ℓ^2(n), where n = {1, 2, …, n}, defined by uυ_i = δ_i (where
δ_i(j) = δ_ij, as usual). It is easy to check that u a u^{-1} = m_λ, where λ : n → C is
defined by λ(i) = λ_i, and m_λ ψ = λψ, again as usual. In other words, a is unitarily
equivalent to a multiplication operator (whose precise nature is left unspecified).
Conversely, each multiplication operator m_f on some L^2(X, μ) is normal, and is
self-adjoint if the function f ∈ L^∞(X, μ) is real-valued (μ-almost everywhere).
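The finite-dimensional statement can be checked by hand on a small example. The following sketch is only an illustration: the matrix a, its eigenvectors, and the helper names matmul and transpose are choices made here and do not occur in the text. It builds the unitary u with uυ_i = δ_i and confirms that u a u^{-1} is the diagonal (multiplication) operator m_λ.

```python
import math

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*x)]

# A self-adjoint operator on H = C^2 (real symmetric, to keep things simple).
a = [[2.0, 1.0], [1.0, 2.0]]

# Orthonormal eigenvectors v_1, v_2 with eigenvalues lambda_1 = 1, lambda_2 = 3.
s = 1.0 / math.sqrt(2.0)
eigvecs = [[s, -s], [s, s]]
lam = [1.0, 3.0]

# u maps v_i to delta_i, so the rows of u are the eigenvectors.
u = eigvecs
u_inv = transpose(u)  # u is real orthogonal, hence u^{-1} = u^T

# u a u^{-1} should be multiplication by the function i -> lambda_i.
m_lambda = matmul(matmul(u, a), u_inv)
```

Up to rounding, m_lambda is the diagonal matrix with entries 1 and 3, i.e. multiplication by λ on ℓ^2({1, 2}).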

Theorem B.116. Any bounded self-adjoint (more generally, normal) operator on a 
separable Hilbert space is unitarily equivalent to a multiplication operator. 
Proof. As in the proof of Proposition B.114, decompose H = ⊕_{i∈I} H_i, where each H_i
contains some cyclic vector ψ_i for a. Applying the proof of Theorem B.112 to each
H_i then yields unitary isomorphisms u_i : H_i → L^2(σ(a), μ_i) with μ_i = μ_{ψ_i},
from which, taking direct sums, we obtain a further unitary isomorphism

H ≅ ⊕_{i∈I} L^2(σ(a), μ_i). (B.363)

Now take the disjoint union X = ⊔_{i∈I} σ(a), i.e., X = ∪_{i∈I} X_i, where X_i = σ(a) × {i},
endowed with the σ-finite measure μ = ⊕_i μ_i (so that if A ⊂ X is given by A = ∪_i A_i
with A_i ⊂ X_i, we have μ(A) = Σ_i μ_i(A_i)). This gives a second isomorphism

⊕_{i∈I} L^2(σ(a), μ_i) ≅ L^2(X, μ), (B.364)

defined by mapping φ_j ∈ L^2(σ(a), μ_j) to the same function on X_j, extended to X
by putting it zero on all other X_i, i ≠ j. This map is obviously unitary. By Theorem
B.112, the isomorphism (B.363) maps the operator a to a direct sum of
multiplication operators, upon which the second isomorphism (B.364) maps this
direct sum to a (single) multiplication operator m_q, where the function q : X → C is
defined by q(x, i) = x (in which (x, i) ∈ X_i ⊂ X, so that x ∈ σ(a) ⊂ C). □

More generally, the operator f(a) on H, for some f ∈ ℬ(σ(a)), is first mapped to
⊕_i m_{f_i}, where f_i is the image of f in L^∞(σ(a), μ_i) in the obvious way, which in turn
is mapped to a multiplication operator m_{f̃}, where f̃(x, i) = f(x), analogously to the
position operator q above. This leads to an isomorphism W*(a) ≅ L^∞(X, μ),
which, by the same reasoning as in the proof of Corollary B.115, also induces an
isomorphism (B.331) of von Neumann algebras. See also Theorem C.140.

Finally, Proposition B.114 may be generalized, to which end (and also as a result
of independent interest) we extend Corollary A.20 to the infinite-dimensional case:

Theorem B.117. Let H be separable and let A ⊂ B(H) be an abelian von Neumann
algebra. Then A = W*(a) for some self-adjoint a ∈ B(H), i.e., A is singly generated.

Proof. Let 𝒫(A) be the set of all projections in A, and let ψ ∈ H be separating for
A and hence cyclic for A' (cf. Lemma B.111 and the proof of Proposition B.114).
The ensuing subset 𝒫(A)ψ = {eψ | e ∈ 𝒫(A)} may be uncountable, but since
any subspace of a separable metric space is separable, there is a countable subset
𝒫_N(A) = {e_n, n ∈ N} of 𝒫(A) such that 𝒫_N(A)ψ is dense in 𝒫(A)ψ, i.e., for any
e ∈ 𝒫(A) there is a subsequence (e_{n_k}) in 𝒫_N(A) such that lim_{k→∞} e_{n_k}ψ = eψ. But
since 𝒫(A) ⊂ A ⊂ A' and the closure of A'ψ equals H, this is true not only on ψ but
on a dense set of vectors aψ, a ∈ A', so that e_{n_k} → e in the strong operator topology.
Thus 𝒫_N(A) is strongly dense in 𝒫(A), and by (B.329) and Theorem B.100 we have

𝒫_N(A)'' = A. (B.365)

The self-adjoint operator that does the job is now given by von Neumann's formula

a = Σ_n 3^{-n} (2e_n − 1_H). (B.366)

To see this, let C*(e_n, n ∈ N) ≡ C*(e_n)_n be the C*-algebra generated by the
projections e_n, so that by construction

𝒫_N(A)'' = C*(e_n)_n''. (B.367)

We will show that

C*(a) = C*(e_n)_n, (B.368)

which combined with (B.320), (B.365) and (B.367) yields the desired conclusion:

A = 𝒫_N(A)'' = C*(e_n)_n'' = C*(a)'' = W*(a). (B.369)

The simplest argument for (B.368) uses the Gelfand isomorphism

C*(e_n)_n ≅ C(X) (B.370)

as commutative C*-algebras, cf. Theorem C.8, where the set of characters

X = {x : C*(e_n)_n → C | x(bc) = x(b)x(c), x(1_H) = 1} (B.371)

of C*(e_n)_n is equipped with the weakest topology that makes all maps

b̂ : X → C; (B.372)

b̂(x) = x(b), b ∈ C*(e_n)_n, (B.373)

continuous. This makes X a compact Hausdorff space, and the isomorphism (B.370)
is given by the Gelfand transform b ↦ b̂. Defining s_n = 2e_n − 1_H, we have ||s_n|| = 1,
since s_nψ = ψ if ψ ∈ e_nH and s_nψ = −ψ if ψ ∈ (1_H − e_n)H = (e_nH)^⊥. The series
(B.366) therefore converges absolutely in B(H), and hence converges, to some limit
a ∈ C*(e_n)_n. We claim that its Gelfand transform â ∈ C(X) separates points of X, so
that by the Stone-Weierstrass Theorem B.51, the *-algebra it generates is dense in
C(X) (in its canonical sup-norm). Thus a likewise generates C*(e_n)_n, and the proof
of Theorem B.117 is ready up to the proof of the above claim, which we now give.

First, note that by definition C*(e_n)_n is generated by the projections e_n, so
that by (B.371) (and the automatic continuity this implies, i.e., x ∈ C*(e_n)_n^*), each
x ∈ X is determined by its values on all e_n. Therefore, for each pair x_i, x_j ∈ X, i ≠ j,
there must be some n ∈ N for which x_i(e_n) ≠ x_j(e_n). Consequently, for each i ≠ j,
the set N_ij = {n ∈ N | x_i(e_n) ≠ x_j(e_n)} is not empty; let n_ij = min N_ij. Since for any
projection e the corresponding function ê can only take the values 0 or 1, each ŝ_n
must take the values ±1, so that, with â = Σ_n 3^{-n} ŝ_n, we have

½(â(x_i) − â(x_j)) = ±3^{-n_ij} + Σ_{n ∈ N_ij, n > n_ij} ±3^{-n} ≠ 0, (B.374)

since whatever the signs, the sum is always smaller than the first term. □
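A toy version of von Neumann's formula (B.366) may help to see why a single operator suffices. In the sketch below, which is an illustration under simplifying assumptions and not part of the proof, the projections are diagonal 0/1 matrices on C^5, modelled as 0/1-valued functions on the five-point set X = {0, …, 4}; the particular projections and the helper names a_hat and lagrange are chosen here. Since the projections separate the points of X, the function â takes five distinct values, and each e_n is recovered as a polynomial in a by Lagrange interpolation (the finite-dimensional shadow of Stone-Weierstrass).

```python
from fractions import Fraction

# Three commuting projections on C^5, written as 0/1 functions on X = {0,...,4}.
projections = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]
points = range(5)

def a_hat(x):
    """Value at x of a = sum_n 3^{-n} (2 e_n - 1), computed exactly."""
    return sum(Fraction(1, 3 ** n) * (2 * e[x] - 1)
               for n, e in enumerate(projections, start=1))

values = [a_hat(x) for x in points]

def lagrange(nodes, targets, t):
    """Evaluate at t the polynomial p with p(nodes[i]) = targets[i]."""
    total = Fraction(0)
    for i, vi in enumerate(nodes):
        term = Fraction(targets[i])
        for j, vj in enumerate(nodes):
            if j != i:
                term *= (t - vj) / (vi - vj)
        total += term
    return total

# For each n there is a polynomial p_n with p_n(a) = e_n.
recovered = [[lagrange(values, e, v) for v in values] for e in projections]
```

Exact rational arithmetic is used so that distinctness of the values of â is not blurred by rounding.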
We now prove the following classification of maximal abelian *-algebras in B(H),
which forms the basis of the Kadison-Singer Conjecture discussed in §2.6 and §4.3.

Theorem B.118. If H is separable (and infinite-dimensional), and A ⊂ B(H) is a
maximal abelian *-algebra, then A is unitarily equivalent to one of the following:

1. L^∞(0,1) ⊂ B(L^2(0,1)) (realized as multiplication operators);

2. ℓ^∞(N) ⊂ B(ℓ^2(N)) (idem);

3. L^∞(0,1) ⊕ ℓ^∞(N) ⊂ B(L^2(0,1) ⊕ ℓ^2(N)) (idem);

4. L^∞(0,1) ⊕ D_n(C) ⊂ B(L^2(0,1) ⊕ C^n), for some n ∈ N (idem),

and these possibilities are (mutually) unitarily inequivalent.

The first claim means that there is a unitary operator u from H to, say, L^2(0,1), such
that the map a ↦ u a u^{-1} from B(H) to B(L^2(0,1)) restricts to uAu^{-1} = L^∞(0,1), so
that A ≅ L^∞(0,1) as both C*-algebras and von Neumann algebras (and likewise for
the other possibilities). The last claim, then, means that there is no unitary map from,
say, L^2(0,1) to ℓ^2(N) that similarly induces an isomorphism L^∞(0,1) ≅ ℓ^∞(N).

Proof. We begin with the easy part, which is the last clause. The key notion in
proving the claimed inequivalence is that of an atomic projection in a von Neumann
algebra M ⊂ B(H). If we partially order projections on H by (cf. Theorem 2.50 and
§C.21)

e ≤ f iff eH ⊆ fH, (B.375)

we say that f is atomic if f ≠ 0, and 0 ≤ e ≤ f implies either e = 0 or e = f. This
property is preserved under unitary equivalence: if M ⊂ B(H) and N ⊂ B(H') and
N = uMu^{-1} for some unitary u : H → H' (again in the sense that a ↦ u a u^{-1} is an
isomorphism M → N), then f is atomic in M iff u f u^{-1} is atomic in N. The reason is
that a ↦ u a u^{-1} induces an isomorphism of the pertinent posets of projections in M
and N, so that all order-theoretical notions are preserved under unitary equivalence.
In the case at hand, the projections are easy to classify:

1. The nonzero projections in L^∞(0,1) are the characteristic functions of measurable
subsets of [0,1] of positive Lebesgue measure. Since any such subset properly
contains another such subset, there are no atomic projections in L^∞(0,1).

2. The nonzero projections in ℓ^∞(N) are the characteristic functions of nonempty
subsets of N, among which there are plenty of atomic ones, namely the
one-dimensional projections δ_x, x ∈ N. Thus ℓ^∞(N) has countably many atomic
projections. Moreover, each other projection majorizes an atomic one.

3. Similarly, L^∞(0,1) ⊕ ℓ^∞(N) has countably many atomic projections, as well
as uncountably many projections that do not majorize any atomic one.

4. Since the atomic projections in D_n(C) are the one-dimensional ones (given by
diagonal matrices with n − 1 zeros and exactly one entry equal to unity),
L^∞(0,1) ⊕ D_n(C) has exactly n atomic projections, as well as uncountably many
projections that do not majorize any atomic one (namely the ones in L^∞(0,1)).

Any unitary equivalence between two of the entries in the list would have to preserve
this fine structure of projections, and hence cannot exist.

We now prove that the list in Theorem B.118 is exhaustive. According to
Theorem B.117, we only need to look at abelian von Neumann algebras A = W*(a),
where a is maximal. According to Theorem B.112 and its Corollary B.115 (whilst
noting that some unitary equivalence a ≅ b induces a unitary equivalence W*(a) ≅
W*(b)), we may further restrict our attention to the case where a is the position
operator on L^2(K, μ), where K = σ(a) ⊂ R is compact and μ is a regular probability
measure (here and in what follows, this is always meant with respect to the Borel
structure inherited from R ⊃ K), with support equal to K, and hence

W*(a) = L^∞(K, μ) ⊂ B(L^2(K, μ)). (B.376)

The final step is to further reduce the possibilities by exploiting equivalences.

Definition B.119. Two measure spaces (X, Σ, μ) and (X', Σ', μ') are:

• equivalent if there is a measurable bijection φ : X → X' with measurable inverse,
and the measures φ_*μ and μ' on X' are equivalent in the sense that φ_*μ(A') = 0
iff μ'(A') = 0 for each A' ∈ Σ'. Here φ_*μ is the measure on (X', Σ') defined by

φ_*μ(A') = μ(φ^{-1}(A')) (A' ∈ Σ'). (B.377)

• isomorphic if there is a measurable bijection φ : X → X' with measurable
inverse, and φ_*μ(A') = μ'(A') for each A' ∈ Σ'.

The ambiguity of the notation φ^{-1} in (B.377) is innocent: for general measurable
maps φ : X → X' the set φ^{-1}(A') can only denote the pre-image {x ∈ X | φ(x) ∈ A'},
whereas for invertible maps one might construe φ^{-1}(A') as {φ^{-1}(x') | x' ∈ A'},
where φ^{-1} is the set-theoretic inverse of φ. Of course, these sets duly coincide.

Lemma B.120. Let K and K' be compact subsets of R, with Σ and Σ' the Borel
structures inherited from R ⊃ K and R ⊃ K', respectively (often omitted in what
follows). Let μ and μ' be probability measures on K and K', respectively, and suppose
that the associated measure spaces (K, Σ, μ) and (K', Σ', μ') are isomorphic.

Then there exists a unitary operator

u : L^2(K, μ) → L^2(K', μ')

such that

u L^∞(K, μ) u^{-1} = L^∞(K', μ'). (B.378)

Note that u does not intertwine the position operators (B.332) on L^2(K, μ) and
L^2(K', μ'). These operators have already done their job in reducing the situation to
L^2(K, μ), and from that point onwards (B.378) is exactly what we need.

Proof. All maps appearing below are assumed Borel. The change-of-variables
formula for a general (i.e., not necessarily invertible) map φ : K → K' reads

∫_{K'} d(φ_*μ) g = ∫_K dμ g∘φ, (B.379)

where g : K' → C. Under the assumption that φ is invertible, this can be rewritten as

∫_{K'} d(φ_*μ) f∘φ^{-1} = ∫_K dμ f, (B.380)

where f : K → C. If φ is also an isomorphism of measure spaces, this becomes

∫_{K'} dμ' f∘φ^{-1} = ∫_K dμ f. (B.381)

If φ_*μ and μ' are equivalent and hence mutually absolutely continuous, the
Radon-Nikodym derivative d(φ_*μ)/dμ' exists (as does its counterpart d(φ^{-1}_*μ')/dμ),
and using (B.137) and (B.380), one easily verifies that the operator

u : L^2(K, μ) → L^2(K', μ'); (B.382)

uψ = √(d(φ_*μ)/dμ') · ψ∘φ^{-1} (B.383)

is isometric. Moreover, u is unitary, because it has an inverse, given by

u^{-1} : L^2(K', μ') → L^2(K, μ); (B.384)

u^{-1}χ = √(d(φ^{-1}_*μ')/dμ) · χ∘φ. (B.385)

We give these general expressions for later use; if φ_*μ = μ', they simplify to

uψ = ψ∘φ^{-1}; (B.386)

u^{-1}χ = χ∘φ. (B.387)

For f ∈ L^∞(K, μ) we then have (cf. Proposition B.73)

u m_f u^{-1} = m_{f∘φ^{-1}}. (B.388)

We already know that the map f ↦ m_f injects L^∞(K, μ) isometrically into B(L^2(K, μ)),
and analogously for L^∞(K', μ'). Furthermore, the map f ↦ f∘φ^{-1} gives an
isomorphism L^∞(K, μ) → L^∞(K', μ'): the property

||f∘φ^{-1}||_∞ = ||f||_∞, (B.389)

which yields injectivity, may be checked either from (B.240) or from the assumed
isomorphism of measures (and hence equivalence of measures, which in fact suffices
for this purpose), whereas invertibility of φ gives surjectivity (since g ∈ L^∞(K', μ')
is the image of f = g∘φ ∈ L^∞(K, μ)). Eq. (B.378) follows. □
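For purely atomic measures on finite sets the Radon-Nikodym derivative is just a ratio of point masses, so the general formulae of the proof can be checked directly. The following sketch assumes two three-point measure spaces with equivalent (strictly positive) measures and a bijection φ between them; all numerical values and function names are hypothetical choices made for this illustration.

```python
import math

mu = [0.2, 0.3, 0.5]        # point masses of mu on K = {0, 1, 2}
mu_prime = [0.1, 0.6, 0.3]  # point masses of mu' on K' = {0, 1, 2}
phi = [2, 0, 1]             # bijection K -> K', with phi[x] = phi(x)
phi_inv = [phi.index(y) for y in range(3)]

def pushforward(y):
    """(phi_* mu)({y}) = mu(phi^{-1}({y}))."""
    return mu[phi_inv[y]]

def u(psi):
    """(u psi)(y) = sqrt(d(phi_* mu)/d mu')(y) * psi(phi^{-1}(y))."""
    return [math.sqrt(pushforward(y) / mu_prime[y]) * psi[phi_inv[y]]
            for y in range(3)]

def u_inv(chi):
    """(u^{-1} chi)(x) = sqrt(mu'({phi(x)}) / mu({x})) * chi(phi(x))."""
    return [math.sqrt(mu_prime[phi[x]] / mu[x]) * chi[phi[x]]
            for x in range(3)]

def norm_sq(vec, measure):
    """Squared L^2 norm with respect to the given point masses."""
    return sum(m * abs(v) ** 2 for v, m in zip(vec, measure))

psi = [1.0, -2.0, 0.5]
chi = [0.3, 1.5, -1.0]
f = [4.0, 5.0, 6.0]

# Left- and right-hand sides of u m_f u^{-1} = m_{f o phi^{-1}}, applied to chi.
lhs = u([f[x] * v for x, v in enumerate(u_inv(chi))])
rhs = [f[phi_inv[y]] * chi[y] for y in range(3)]
```

Here norm_sq(u(psi), mu_prime) agrees with norm_sq(psi, mu) up to rounding, which is the isometry property, and lhs agrees with rhs.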
The final step of the proof appeals to a deep and fundamental classification theorem
in measure theory, which goes back to Kuratowski in a form that applies to general
Polish (i.e., complete separable metric) spaces. This theorem implies:

Lemma B.121. Let (K, Σ, μ) be an infinite probability space (in that infinitely many
different elements of Σ have positive measure), where K ⊂ R is compact and Σ is
the σ-algebra inherited from the Borel structure on R. Then (K, Σ, μ) is isomorphic
to exactly one of the following possibilities (called standard measure spaces):

1. K = [0,1] with μ equal to Lebesgue measure μ_L;

2. K = N' = {2^{-n}, n ∈ N} ∪ {1}, equipped with any probability measure μ' for
which μ'({2^{-n}}) > 0 for each n ∈ N and μ'({1}) = 0;

3. K = [0,1] with μ = tμ_L + (1 − t)μ', for some 0 < t < 1;

4. K = [0,1] with μ = tμ_L + (1 − t)μ_n, for some n ∈ N and 0 < t < 1,

where μ_n is an arbitrary strictly nonzero probability measure on the n-point set

n' = {1/n, 2/n, …, (n − 1)/n, 1}. (B.390)

Here we have stated the result in terms of probability measures μ on compact spaces
K ⊂ [0,1]; this is convenient in the context of our proof. To understand the last two
cases, for general measure spaces (X, Σ, μ) we say that A ∈ Σ is an atom if for any
measurable B ⊂ A we have either μ(B) = 0 or μ(A\B) = 0 (but not both; this implies
μ(A) > 0, whence an equivalent definition of an atom as a set A ∈ Σ having positive
measure as well as the property that if some measurable subset B ⊂ A has measure
μ(B) < μ(A), then μ(B) = 0). In our case at hand (K, μ), each atom A contains a point
x ∈ K such that μ(A) = μ({x}) and μ(A\{x}) = 0, so that modulo null sets we may
identify each atom A with the measure-carrying point x it contains. Moreover, K can
contain at most a countable set 𝒜 = {x_n}_n of such points x_n. The formulae

μ = μ_a + μ_c; (B.391)

μ_a(A) = μ(A ∩ 𝒜); (B.392)

μ_c(A) = μ(A \ (A ∩ 𝒜)), (B.393)

then give the canonical decomposition of μ into an atomic part μ_a and a continuous
part μ_c. This, then, is the sense in which the last two cases of Lemma B.121 are
meant. Note that characteristic functions 1_A on atoms A ⊂ K yield atomic projections
in L^∞(K, μ), linking the two notions of atomicity that play a role in this proof.

The first entry of this lemma yields the first entry in the list in the theorem. To
obtain the others, we need a few more unitary equivalences. For the second, define

u : L^2(N', μ') → ℓ^2(N); (B.394)

uψ(n) = √(μ'({2^{-n}})) ψ(2^{-n}), (B.395)

with the value ψ(1) being irrelevant. This operator is unitary and, just like in
(B.378), it intertwines

u L^∞(N', μ') u^{-1} = ℓ^∞(N). (B.396)
Note that (B.394) is a special case of (B.383). The third and fourth cases require
the following construction: if 𝒜 ⊂ K is the set of atoms in (K, Σ, μ), we decompose

K = (K\𝒜) ∪ 𝒜, (B.397)

as a disjoint union. For any measure μ this induces an orthogonal decomposition

L^2(K, μ) = L^2(K\𝒜, μ) ⊕ L^2(𝒜, μ); (B.398)

L^2(K\𝒜, μ) = e L^2(K, μ); (B.399)

L^2(𝒜, μ) = (1 − e) L^2(K, μ), (B.400)

where e = 1_{K\𝒜} and 1 − e = 1_𝒜 are projections. Using (B.391), this gives

L^2(K\𝒜, μ) = L^2(K, μ_c); (B.401)

L^2(𝒜, μ) = L^2(𝒜, μ_a), (B.402)

so that at the end of the day we obtain

L^2(K, μ) = L^2(K, μ_c) ⊕ L^2(𝒜, μ_a). (B.403)

This in turn induces the decomposition

L^∞(K, μ) = L^∞(K, μ_c) ⊕ L^∞(𝒜, μ_a); (B.404)

L^∞(K, μ_c) = e L^∞(K, μ) = e L^∞(K, μ) e; (B.405)

L^∞(𝒜, μ_a) = (1 − e) L^∞(K, μ) = (1 − e) L^∞(K, μ) (1 − e). (B.406)

Combined with (B.396), this shows that the third entry of the lemma yields the third
entry of the theorem. To obtain the fourth and last, we need the unitary map

u : L^2(n', μ_n) → C^n; (B.407)

uψ(m) = √(μ_n(m/n)) ψ(m/n) (m = 1, …, n), (B.408)

which delivers the unitary equivalence

u L^∞(n', μ_n) u^{-1} = D_n(C). (B.409)

Short of a proof of Lemma B.121, we have (at last!) proved Theorem B.118. □

Thus one of the remarkable novelties of infinite-dimensional Hilbert space is that
even in the separable case, uniqueness of maximal abelian *-algebras is lost.

There is a different proof of Theorem B.118 that does not rely on Kuratowski's
Lemma B.121, but instead is based on properties of the projection lattice 𝒫(A) in A.
In the following outline of this proof, A is a maximal abelian *-subalgebra of B(H),
where H is a separable Hilbert space. Hence A is a von Neumann algebra, which is
generated by its projections. This leaves three mutually exclusive possibilities:

1. A has no minimal projections;

2. A is generated by its minimal projections;

3. A has minimal projections that do not generate A.

The following lemma, whose proof we merely sketch, replaces Lemma B.121.

Lemma B.122. If H is separable and A ⊂ B(H), then 𝒫(A) contains a maximal
totally ordered set 𝒮(A) that generates A (as a von Neumann algebra).

Proof. This is proved in two steps. First, 𝒫(A) contains a countable subset 𝒫_N(A)
that generates A. Indeed, according to Lemma B.111 and Proposition B.114 (and
maximality of A), H contains a unit vector ψ that is both cyclic and separating for
A. Since H is separable, 𝒫(A)ψ ⊂ H has a countable dense subset, which is 𝒫_N(A)ψ
(cf. the proof of Theorem B.117).

The second step is trickier, namely to construct a maximal totally ordered set
𝒮(A) from 𝒫_N(A). This is done inductively. We number 𝒫_N(A) = {e_1, e_2, …}.
Starting from 𝒮_1 = {0_H, e_1, 1_H}, we now construct finite totally ordered sets 𝒮_n of
projections such that 𝒮_n ⊂ 𝒮_{n+1} and e_{n+1} lies in the linear span of 𝒮_{n+1}. Let

𝒮_n = {e'_0 = 0_H, e'_1, …, e'_{r_n − 1}, e'_{r_n} = 1_H}, (B.410)

where e'_0 < ⋯ < e'_{r_n} (where e < f means e ≤ f and e ≠ f), and define

𝒮_{n+1} = 𝒮_n ∪ {e'_i + (e'_{i+1} − e'_i) e_{n+1}, i = 0, …, r_n − 1}. (B.411)

Given the total ordering in 𝒮_n, it is easy to see that each e'_i + (e'_{i+1} − e'_i) e_{n+1} is
indeed a projection, and, by the same token, that 𝒮_{n+1} meets its specification. Let

𝒮_∞ = ∪_n 𝒮_n, (B.412)

which remains totally ordered but typically is infinite, and take the poset 𝔓 of
all totally ordered subsets of 𝒫(A) that contain 𝒮_∞, ordered by inclusion. Zorn's
Lemma then yields a maximal element of 𝔓, and this is our 𝒮(A): this maximal
element is itself totally ordered, and since its linear span contains each projection
e_n ∈ 𝒫_N(A), the projections in 𝒮(A) generate A (since the e_n already do so). □
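The inductive refinement step (B.411) is easy to try out in the diagonal algebra D_6(C), where a projection is the indicator of a subset of {0, …, 5}, the order e ≤ f is inclusion, and e' + (e'' − e')e corresponds to E' ∪ ((E'' \ E') ∩ E). The sketch below is a finite illustration only; the generating subsets and the helper names refine, is_chain and in_span are chosen here and are not taken from the text.

```python
def refine(chain, s):
    """One step of (B.411): add lower ∪ ((upper − lower) ∩ s) between neighbours.

    chain is an increasing list of frozensets from the empty set to the full set.
    """
    new = set(chain)
    for lower, upper in zip(chain, chain[1:]):
        new.add(lower | ((upper - lower) & s))
    # In a totally ordered family of finite sets, distinct members have
    # distinct sizes, so sorting by size restores the increasing order.
    return sorted(new, key=len)

def is_chain(chain):
    """Check that consecutive members are strictly increasing under inclusion."""
    return all(a < b for a, b in zip(chain, chain[1:]))

def in_span(chain, s):
    """Is the indicator of s a linear combination of indicators of chain members?

    The span of a chain is spanned by the indicators of the disjoint sets
    upper − lower, so s must be a union of such differences.
    """
    pieces = [b - a for a, b in zip(chain, chain[1:]) if (b - a) <= s]
    return frozenset().union(*pieces) == s

full = frozenset(range(6))
generators = [frozenset({0, 1, 2}), frozenset({1, 3}), frozenset({2, 3, 4})]

# S_1 = {0, e_1, 1}, then refine successively with e_2 and e_3.
chain = sorted({frozenset(), generators[0], full}, key=len)
for s in generators[1:]:
    chain = refine(chain, s)
```

After both refinements the chain is still totally ordered and every generator lies in its linear span, as the construction promises.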

The above trichotomy then leaves the following possibilities:

1. Let ψ ∈ H be a unit vector that is cyclic and separating for A. Then

α : 𝒮(A) → [0,1]; (B.413)

e ↦ ⟨ψ, eψ⟩, (B.414)

is an isomorphism of posets. It is easy to show that the linear span of the set of
all vectors α^{-1}(t)ψ, t ∈ [0,1], is dense in H, and that the map

u α^{-1}(t)ψ = 1_{[0,t]} (B.415)

extends (by linearity and continuity) to a unitary isomorphism

u : H → L^2(0,1), (B.416)

which intertwines A with L^∞(0,1) in the sense that

uAu^{-1} = L^∞(0,1). (B.417)

2. This case relies on a general fact about von Neumann algebras M: if e ∈ 𝒫(M)
is minimal, then eMe = C·e. This implies that if M = A is abelian, then for each
a ∈ A one has ea = λe for some λ ∈ C. It follows that:

• Each minimal projection e_i in 𝒫(A) is one-dimensional.

• Different minimal projections are orthogonal.

• Σ_i e_i = 1_H (strongly), where the sum is over all minimal projections in A.

Since H is separable, we may assume i ∈ N, so that we obtain a countable basis
(υ_i) of H in which e_i = |υ_i⟩⟨υ_i|, and hence have a unitary isomorphism

u : H → ℓ^2(N); (B.418)

υ_i ↦ δ_i, (B.419)

i.e., u is defined by linear and continuous extension of (B.419). Clearly,

uAu^{-1} = ℓ^∞(N). (B.420)

3. The first part of the analysis in the previous item still applies, but this time, the
sum e = Σ_i e_i over all minimal projections in A is not equal to 1_H. If there are
n ∈ N such projections, we obtain

eH ≅ C^n, (B.421)

and otherwise

eH ≅ ℓ^2(N). (B.422)

We combine these in the notation

eH ≅ ℓ^2(κ), (B.423)

where κ = n, in which case ℓ^2(κ) = C^n and ℓ^∞(κ) = D_n(C), or κ = N. Furthermore,
we have

(1_H − e)H ≅ L^2(0,1), (B.424)

as in the first item. By construction, the corresponding unitary

u : H → ℓ^2(κ) ⊕ L^2(0,1) (B.425)

then satisfies

uAu^{-1} = ℓ^∞(κ) ⊕ L^∞(0,1). (B.426)

This finishes the alternative proof (sketch) of Theorem B.118.
B.18 Compact operators

The spectral theorem (in whatever version) on infinite-dimensional Hilbert spaces
considerably simplifies for a class of well-behaved operators called compact.

Definition B.123. A linear map a : V → W between Banach spaces V, W is called
compact if for some (and hence all) d > 0 the image a(V_{≤d}) of the closed d-ball

V_{≤d} = {v ∈ V : ||v|| ≤ d} (B.427)

is pre-compact in W (i.e., the closure of a(V_{≤d}) is compact), or, equivalently, if the
image (av_n) of any bounded sequence (v_n) in V has a convergent subsequence.

Before turning to Hilbert spaces, we mention two facts of general interest. 

Proposition B.124. A compact operator is bounded.

Proof. If not, then for any n ∈ N there is some v_n ∈ V_{≤1} for which ||av_n|| > n, so
that (av_n) cannot possibly have a convergent subsequence. □

Proposition B.125. A compact operator a : V → W maps weakly convergent
sequences in V to norm-convergent sequences in W.

Proof. Let (v_n) be a sequence in V that weakly converges to v. It is easy to show
that if a : V → W is (norm) continuous, then it maps weakly convergent sequences
in V to weakly convergent sequences in W. Therefore, the sequence (av_n) weakly
converges to av. If (av_n) failed to converge to av in norm, then it would have a
subsequence (av_{n_k}) such that for some ε > 0 and all sufficiently large k one had

||av_{n_k} − av|| ≥ ε. (B.428)

However, (v_n), being weakly convergent, is bounded by Lemma B.126 below, and
hence also its subsequence (v_{n_k}) must be bounded. Since a is compact, (av_{n_k}) has
some norm-convergent subsequence, which necessarily converges to av (since we
know this is the weak limit of the ambient sequence (av_n) and hence also of any of
its subsequences, and if a norm-limit exists, the corresponding weak limit must be
the same). But for large enough k this convergence flatly contradicts (B.428). □

Lemma B.126. A weakly convergent sequence in a Banach space is bounded.

Proof. Since v_n → v weakly, the sequence (φ(v_n)) in C converges to φ(v) for each
φ ∈ V*, so that sup_n{|φ(v_n)|} < ∞. Using the notation (B.129), this may be rewritten
as sup_n{|v̂_n(φ)|} < ∞. Using Theorem B.78 (with V ⇝ V**, W = C, and X = N),
this implies sup_n{||v̂_n||} < ∞ and hence sup_n{||v_n||} < ∞ by Proposition B.44. □

Definition B.123 simplifies if V = W = H is a Hilbert space, since we have:

Proposition B.127. If the image a(H_{≤1}) ⊂ H of a linear map a : H → H is
pre-compact, then this image is in fact compact (and hence a is compact).



For the proof, call a Banach space V reflexive if V** ≅ V (i.e. through the canonical
injection v ↦ v̂, cf. Proposition B.44). Hilbert spaces H are reflexive, since H* ≅ H
by Theorem B.66. Proposition B.127 then follows from yet another lemma:

Lemma B.128. If V is a reflexive Banach space and a : V → W is compact, then
a(V_{≤1}) is compact.

Proof. The proof relies on a corollary of the Banach-Alaoglu Theorem B.48,
according to which V_{≤1} is weakly compact if V is reflexive (indeed, by applying
Banach-Alaoglu to V* instead of V, it follows that the unit ball in V** is compact in
its weak*-topology; if, in addition, V is reflexive, then the inverse of the canonical
injection V → V** maps the weak*-topology on V** to the weak topology on V).

So let a : V → W be compact, and let (w_n) be a sequence in a(V_{≤1}), say w_n = av_n for
some sequence (v_n) in V_{≤1}. Then since V_{≤1} is weakly compact, (v_n) has a weakly
convergent subsequence (v_{n_k}) in V_{≤1}, say lim_{k→∞} v_{n_k} = v weakly. By Proposition B.125,
lim_{k→∞} av_{n_k} = av in norm. In other words, (av_n) has a norm-convergent
subsequence, namely (av_{n_k}), with limit in a(V_{≤1}). Hence a(V_{≤1}) is compact. □

In view of Proposition B.127, we may as well take the following starting point:

Definition B.129. If H is a Hilbert space, a linear map a : H → H is called compact
when the image a(H_{≤1}) of the closed unit ball in H is compact.

We write B_0(H) for the set of all compact operators on H.

Theorem B.130. The compact operators B_0(H) form a C*-algebra in B(H) in the
operations inherited from B(H). Furthermore, B_0(H) is a two-sided ideal in B(H).

Unfolding this theorem, the claim consists of the following parts:

1. B_0(H) ⊂ B(H), i.e., a compact operator is automatically bounded.

2. B_0(H) is a vector space.

3. If a, b ∈ B_0(H), then ab ∈ B_0(H).

4. If (a_n) is a convergent sequence in B(H) with limit a, i.e., ||a_n − a|| → 0 for some
a ∈ B(H), and if each a_n ∈ B_0(H), then a ∈ B_0(H).

5. If a ∈ B_0(H), then a* ∈ B_0(H).

6. If a ∈ B_0(H) and b ∈ B(H), then ab ∈ B_0(H) and ba ∈ B_0(H).

Proof. The first clause is Proposition B.124, and the second and sixth (which
implies the third) are almost trivial. For the fourth, we use the following criterion for
pre-compactness (in a metric space): K ⊂ H is pre-compact iff for each ε > 0 it can
be covered by a finite number of open ε-balls B_ε(χ_i) = {φ ∈ H : ||φ − χ_i|| < ε},
where i = 1, …, m (i.e., all balls have the same radius ε). Given that ||a_n − a|| → 0,
for each ε > 0 there is n such that ||a_n − a|| < ε/2. Since a_n(H_{≤1}) is compact, it
has a finite cover with ε/2-balls; in other words, for each ψ ∈ H_{≤1} there is an i such
that ||a_nψ − χ_i|| < ε/2. Hence, as ||ψ|| ≤ 1, we may estimate

||aψ − χ_i|| ≤ ||(a_n − a)ψ|| + ||a_nψ − χ_i|| < ||a_n − a|| ||ψ|| + ½ε ≤ ½ε + ½ε = ε.

So a(H_{≤1}) has a finite cover with ε-balls and hence is pre-compact. This finishes
the proof from Definition B.123; from Definition B.129, invoke Proposition B.127.

To prove the fifth clause, we need a result of independent interest. We say that a
linear map a : H → H is (or has) finite rank if its image is finite-dimensional.

Proposition B.131. A bounded operator a ∈ B(H) is compact iff it is a norm-limit
of finite-rank operators.

Proof. Since it is easy to see that finite-rank operators are compact, the "⇐"
direction follows from clause 4 of Theorem B.130. The difficult direction is the opposite
one, which we prove by contradiction (as a technical note, our proof assumes that
H is separable, but the claim also holds in the non-separable case, in which it can be
shown that the closure of ran(a) is separable whenever a is compact).

Pick a basis (υ_i) of H (or, in the non-separable case, of the closure of ran(a)), and
define e_n to be the projection onto the linear span of the first n basis vectors. Given
some a ∈ B_0(H), define a_n = e_n a. We show that ||a_n − a|| → 0. If not, then

∃ε > 0 ∀N ∃n > N : ||a_n − a|| ≥ ε, (B.429)

which in turn implies that for any δ > 0 there are unit vectors ψ_n for which we have
||(a_n − a)ψ_n|| ≥ ε − δ. Take δ = ε/2, whence

∃ε > 0 ∀N ∃n > N : ||(a_n − a)ψ_n|| ≥ ε/2. (B.430)

Now a is compact, so that, noting that ψ_n ∈ H_{≤1}, the sequence (aψ_n) has a
convergent subsequence, say with limit φ. We may then write

(a_n − a)ψ_n = (e_n − 1_H)(aψ_n − φ + φ), (B.431)

so that, for each ψ_n,

||(a_n − a)ψ_n|| ≤ ||e_n − 1_H|| ||aψ_n − φ|| + ||(e_n − 1_H)φ||. (B.432)

If we now restrict the ψ_n so as to lie in the convergent subsequence in question, then
the right-hand side vanishes as n → ∞:

• Since ||e_n|| = ||1_H|| = 1 we have ||e_n − 1_H|| ≤ 2;

• By construction we have lim_n ||aψ_n − φ|| = 0;

• For any basis of H, and any φ ∈ H, we have lim_n ||(e_n − 1_H)φ|| = 0 (although
||e_n − 1_H|| itself fails to converge to zero if H is infinite-dimensional!).

However, this contradicts (B.430). □

We use the notation of this proof to establish the fifth clause of Theorem B.130.
By the sixth, the operator a_n* = a* e_n is compact, since any finite-rank operator such
as e_n is compact and a* is bounded. Therefore, ||a_n* − a*|| = ||a_n − a|| → 0, so a_n* → a*
and hence a* ∈ B_0(H) by clause 4. □
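The contrast between norm convergence of e_n a and mere strong convergence of e_n can be made concrete for a diagonal operator. The sketch below makes two simplifying assumptions that are not in the text: the operator is a = diag(1, 1/2, 1/3, …), and the infinite-dimensional space is replaced by a large truncation C^M. It also uses the fact that the norm of a diagonal operator is the supremum of the moduli of its entries. The helper names are hypothetical.

```python
M = 1000  # truncation dimension standing in for an infinite-dimensional H

# The diagonal operator a = diag(1, 1/2, 1/3, ...), stored by its diagonal.
a_diag = [1.0 / k for k in range(1, M + 1)]

def op_norm_diag(diagonal):
    """Operator norm of a diagonal operator: the largest |entry|."""
    return max(abs(x) for x in diagonal)

def finite_rank_error(n):
    """Norm of e_n a − a, where e_n projects onto the first n basis vectors."""
    return op_norm_diag([0.0 if k < n else -a_diag[k] for k in range(M)])

def projection_error(n):
    """Norm of e_n − 1, which does not shrink as n grows (for n < M)."""
    return op_norm_diag([0.0 if k < n else -1.0 for k in range(M)])

errors = [finite_rank_error(n) for n in (1, 10, 100)]
```

In this model finite_rank_error(n) equals 1/(n + 1), which tends to zero, whereas projection_error(n) stays equal to 1 for every n < M.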
B.19 Spectral theory for self-adjoint compact operators

If only to establish our notation, let us begin by recalling Theorem A.10:

Theorem B.132. Let dim(H) < ∞ and let a : H → H be a self-adjoint operator.
Then the eigenvalues λ of a are real (collected in the point spectrum σ_p(a) ⊂ R),
the eigenspaces H_λ corresponding to different eigenvalues λ are orthogonal, and
we have the spectral resolutions

a = Σ_{λ∈σ_p(a)} λ · e_λ; (B.433)

1_H = Σ_{λ∈σ_p(a)} e_λ, (B.434)

where e_λ is the projection onto the eigenspace

H_λ = {ψ ∈ H | aψ = λψ}. (B.435)

This theorem is equivalent to the following alternative version:

Theorem B.133. Let dim(H) < ∞ and let a : H → H be a self-adjoint operator (i.e.,
a* = a). Then a is diagonalizable, in the sense that H has a basis (υ_i) consisting of
eigenvectors of a. Furthermore, the eigenvalues λ_i of a are real.

If a is diagonalizable, using the familiar notation e_{υ_i} = |υ_i⟩⟨υ_i|, cf. (2.7), we write

aυ_i = λ_i υ_i; (B.436)

a = Σ_{i∈I} λ_i e_{υ_i}. (B.437)

To move from Theorem B.132 to Theorem B.133, pick some basis (υ_k^{(λ)}) of each
eigenspace H_λ. By Proposition A.8 we then have

e_λ = Σ_{k=1}^{dim(H_λ)} |υ_k^{(λ)}⟩⟨υ_k^{(λ)}|. (B.438)

The totality of all υ_k^{(λ)}, where λ ∈ σ_p(a) and k = 1, …, dim(H_λ), is our basis:
relabeling this set as (υ_i), eq. (B.434) becomes 1_H = Σ_i |υ_i⟩⟨υ_i|, or ψ = Σ_i c_i υ_i
with c_i = ⟨υ_i, ψ⟩ for each ψ ∈ H, which according to Theorem B.61.1 shows that
(υ_i) is a basis of H (and hence i = 1, …, dim(H)). Furthermore, (B.433) yields
aυ_k^{(λ)} = λυ_k^{(λ)}, or (B.436), so that each υ_i is an eigenvector of a.

Conversely, for each λ ∈ σ_p(a), assemble all eigenvectors υ_i whose eigenvalues λ_i
are equal to λ and relabel those as υ_k^{(λ)}. This yields e_λ through (B.438), and the
above argument may be rerun in the opposite direction: the basis property of the (υ_i)
implies (B.434), and the eigenvector property (B.436) yields (B.433) by verifying it on
each basis vector υ_i = υ_k^{(λ)}, recalling that by construction, λ_i = λ.
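The passage between the two versions can be followed on a 3 × 3 example with a degenerate eigenvalue. The sketch below is an illustration only: the matrix, its hand-computed eigenvectors and the helper names are chosen here. It groups the eigenvectors by eigenvalue, forms e_λ as in (B.438), and assembles the right-hand sides of (B.433) and (B.434).

```python
import math

def outer(v, w):
    """The matrix |v><w| for real vectors v, w."""
    return [[vi * wj for wj in w] for vi in v]

def add(x, y):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def scale(c, x):
    """Scalar multiple of a matrix."""
    return [[c * p for p in row] for row in x]

zero = [[0.0] * 3 for _ in range(3)]

# Self-adjoint operator on C^3 with eigenvalues 1 (simple) and 3 (double).
a = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

# Orthonormal eigenvectors, grouped by eigenvalue.
s = 1.0 / math.sqrt(2.0)
eigenbasis = {
    1.0: [[s, -s, 0.0]],
    3.0: [[s, s, 0.0], [0.0, 0.0, 1.0]],
}

def projection(vectors):
    """e_lambda as the sum of |v><v| over a basis of the eigenspace, cf. (B.438)."""
    e = zero
    for v in vectors:
        e = add(e, outer(v, v))
    return e

e = {lam: projection(vs) for lam, vs in eigenbasis.items()}

# Right-hand sides of the spectral resolutions (B.433) and (B.434).
resolution_of_a = zero
resolution_of_identity = zero
for lam, e_lam in e.items():
    resolution_of_a = add(resolution_of_a, scale(lam, e_lam))
    resolution_of_identity = add(resolution_of_identity, e_lam)
```

Up to rounding, resolution_of_a reproduces a and resolution_of_identity is the unit matrix.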
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We now adapt these results to infinite dimension. We still say that an operator 
a : H ^ H is diagonalizable if H has a basis (Vj) consisting of eigenvectors of a. 

Proposition B.134. Let H = £^{I) for some set I (i.e., H has a basis {Vi)iei)- Then 
some collection {Xi)ifzj of complex numbers occurs as the set of eigenvalues of some 
bounded operator a G B{H) iff {^i)isi is bounded, i.e., sup{|A, |,! G 1} <°°. 

Defining a function A ; / —>■ C by A (i) = A,-, we may express this as A G 

Proof. If $a \in B(H)$ is diagonal in some basis $(v_i)$, with eigenvalues $(\lambda_i)$, then

$$|\lambda_i| = \|\lambda_i v_i\| = \|a v_i\| \leq \|a\| \|v_i\| = \|a\|, \tag{B.439}$$

for each $i \in I$, whence the eigenvalues are bounded. Conversely, if they are, so that
$\|\lambda\|_\infty < \infty$, take a basis $(v_i)_{i \in I}$ of $H$, write $\psi = \sum_i c_i v_i$ with $\sum_i |c_i|^2 < \infty$, cf. Theorem
B.61, and define $a\psi = \sum_i \lambda_i c_i v_i$. Since

$$\sum_i |\lambda_i c_i|^2 \leq \|\lambda\|_\infty^2 \sum_i |c_i|^2 = \|\lambda\|_\infty^2 \|\psi\|^2 < \infty, \tag{B.440}$$

we have $a\psi \in H$ by Lemma B.59. These estimates also prove that $\|a\psi\| \leq \|\lambda\|_\infty \|\psi\|$,
so that $a$ is bounded, with $\|a\| \leq \|\lambda\|_\infty$ (in fact, equality holds here). □
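The estimate (B.440) and the remark that equality holds can be checked numerically on a finite truncation. The following is a minimal sketch, assuming a truncation to 50 basis vectors and an arbitrarily chosen bounded eigenvalue sequence; the helper names `apply_diagonal` and `l2_norm` are illustrative and not part of the text.

```python
import math

def apply_diagonal(eigenvalues, coeffs):
    """Apply the diagonal operator: the coefficient c_i is sent to lambda_i * c_i."""
    return [lam * c for lam, c in zip(eigenvalues, coeffs)]

def l2_norm(coeffs):
    """Norm of a vector given by its coefficients in an orthonormal basis."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

dim = 50  # truncation size (an assumption of this sketch)
eigenvalues = [(-1) ** n * (1 - 1 / (n + 1)) for n in range(dim)]
sup_norm = max(abs(lam) for lam in eigenvalues)

# For an arbitrary vector the ratio ||a psi|| / ||psi|| stays below sup |lambda_i|.
psi = [1 / (n + 1) for n in range(dim)]
ratio = l2_norm(apply_diagonal(eigenvalues, psi)) / l2_norm(psi)

# On the basis vector belonging to the largest |lambda_i| the bound is attained.
k = max(range(dim), key=lambda n: abs(eigenvalues[n]))
basis_k = [1.0 if n == k else 0.0 for n in range(dim)]
attained = l2_norm(apply_diagonal(eigenvalues, basis_k))
```

In the genuinely infinite-dimensional case the supremum need not be attained on any single basis vector, but it is still approached along the basis.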

This characterization of bounded diagonalizable operators by a property of their 
eigenvalues may be considerably sharpened for self-adjoint compact operators. 

Theorem B.135. Let $\dim(H) = \infty$, and let $a \in B(H)_{\mathrm{sa}}$. Then $a$ is compact iff it is
diagonalizable with $\lambda \in \ell_0(I)$, in which case the sum in (B.437) converges in norm.

We recall that some function $f : I \to \mathbb{C}$ is in $\ell_0(I)$ if for each $\varepsilon > 0$ there is a finite
subset $I_\varepsilon \subset I$ such that $|f(i)| < \varepsilon$ for all $i \notin I_\varepsilon$. If $I = \mathbb{N}$ (and in fact the proof below
will produce this labeling of the basis), then the condition $\lambda \in \ell_0(\mathbb{N})$ just means that

$$\lim_{n \to \infty} \lambda_n = 0. \tag{B.441}$$

Before proving this, we state the infinite-dimensional analogue of Theorem B.132: 

Theorem B.136. Let $\dim(H) = \infty$ and let $a$ be some bounded self-adjoint operator.
Then $a$ is compact iff it has the properties stated in Theorem B.132, amended by
the following clarifications and addenda (cf. Definition B.6, where $X = \sigma_p(a)$):

1. The sum in (B.433) converges in norm;

2. The sum in (B.434) converges strongly, i.e., for each $\psi \in H$ we have

$$\psi = \sum_{\lambda \in \sigma_p(a)} e_\lambda \psi; \tag{B.442}$$

3. If $\lambda \in \sigma(a)$ and $\lambda \neq 0$, then $\lambda \in \sigma_p(a)$ and $\dim(H_\lambda) < \infty$;

4. Always $0 \in \sigma(a)$, and $\sigma_p(a) \subset \mathbb{R}$ has 0 as its only accumulation point.

The equivalence between Theorems B.135 and B.136 is a bit more subtle than in 
finite dimension, but the key to the proof of both is the following lemma. 
Lemma B.137. A compact self-adjoint operator $a$ has an eigenvalue $\lambda = \pm\|a\|$.

Note that by definition of the operator norm, one always has $|\lambda| \leq \|a\|$, whether or
not $a$ is compact, but the point about compact self-adjoint operators is firstly that
they have an eigenvalue at all, and secondly that the above inequality is saturated.

Proof. We use the fact that the norm $\psi \mapsto \|\psi\|$ is continuous on $H$, see (B.5), so
that it attains a maximum on the compact set $a(H_1)$. Assume that this maximum is
attained at $a\psi_1$, with $\|\psi_1\| = 1$. By definition of the operator norm, this maximum
must be $\|a\|$, so that $\|a\|^2 = \|a\psi_1\|^2$. Cauchy-Schwarz and $a^* = a$ then yield

$$\|a\|^2 = \langle a\psi_1, a\psi_1 \rangle = \langle \psi_1, a^2\psi_1 \rangle \leq \|\psi_1\| \|a^2\psi_1\| \leq \|a^2\| = \|a\|^2, \tag{B.443}$$

where we have used (C.2). In the Cauchy-Schwarz inequality (A.1) one has equality
iff either $v = 0$ or $w = zv$ for some $z \in \mathbb{C}$, so that we must have $a^2\psi_1 = z\psi_1$, with $|z| = \|a\|^2$.
Moreover, $z \in \mathbb{R}$, as eigenvalues must be real (which trivially follows from
$a^* = a$; one does not even need Theorem B.93 here), so $a^2\psi_1 = \lambda^2\psi_1$, with either
$\lambda = \|a\|$ or $\lambda = -\|a\|$. If $a\psi_1 = \lambda\psi_1$, we are ready. If not, then $\chi_1 = a\psi_1 - \lambda\psi_1 \neq 0$,
in which case $a\chi_1 = a^2\psi_1 - \lambda a\psi_1 = \lambda^2\psi_1 - \lambda a\psi_1 = -\lambda\chi_1$. □
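The last step of this proof can be illustrated on a two-dimensional example. The sketch below assumes the self-adjoint matrix diag(1, -1) and a unit vector that is an eigenvector of its square but not of the matrix itself; the names `matvec`, `psi1` and `chi1` are chosen for illustration only.

```python
import math

def matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

a = [[1.0, 0.0], [0.0, -1.0]]  # self-adjoint, with operator norm 1
lam = 1.0                      # candidate eigenvalue +||a||

# psi1 satisfies a^2 psi1 = lam^2 psi1, yet a psi1 differs from lam * psi1.
psi1 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
a_psi1 = matvec(a, psi1)

# chi1 = a psi1 - lam psi1 is nonzero and is an eigenvector with eigenvalue -lam.
chi1 = [x - lam * y for x, y in zip(a_psi1, psi1)]
a_chi1 = matvec(a, chi1)
```

Here `chi1` is a multiple of the second basis vector, so the eigenvalue found by the construction is $-\|a\|$ rather than $+\|a\|$.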

Corollary B.138. A compact self-adjoint operator is diagonalizable. 

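Proof. Using the notation of the above proof, we call the (normalized) eigenvector
in question $v_1$ (so either $v_1 = \psi_1$ or $v_1 = \chi_1/\|\chi_1\|$). Note that if $\langle \varphi, v_1 \rangle = 0$, then
$\langle a\varphi, v_1 \rangle = \langle \varphi, a^* v_1 \rangle = \langle \varphi, a v_1 \rangle = \pm\lambda \langle \varphi, v_1 \rangle = 0$, so that $a$ maps the orthogonal
complement $v_1^\perp = \{\varphi \in H \mid \langle v_1, \varphi \rangle = 0\}$ of $v_1$ into itself. This implies that $a$ commutes
with the projection $e_1$ onto $v_1^\perp$, i.e., $e_1 a = a e_1$ and hence also $e_1 a = e_1 a e_1$, in
which the right-hand side is essentially the restriction of $a$ to $v_1^\perp = e_1 H$.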

By Theorem B.130.6, the operator $e_1 a$ is compact, like $a$ itself, and it is also
self-adjoint. If $e_1 a = 0$ we are ready, since $v_1$ plus any basis of $e_1 H$ is a basis of
$H$ that diagonalizes $a$. If not, we apply Lemma B.137 to the operator $e_1 a$, finding
an eigenvector $v_2$ with nonzero eigenvalue $\lambda_2$. A simple computation shows that
$e_1 v_2 = v_2$, so that $v_2 \in e_1 H$, from which we infer, in turn, that $a v_2 = \lambda_2 v_2$.

So we have found two basis vectors $(v_1, v_2)$ of $H$ that are eigenvectors of $a$.
The above procedure may then be iterated: we define $e_2$ as the projection onto the
orthogonal complement of $v_1$ and $v_2$, and consider $e_2 a$. If $e_2 a = 0$ we are ready; if
not, we find a third eigenvector of $e_2 a$ and hence of $a$ in $e_2 H$, et cetera.

• If $H = \mathbb{C}^n$ is finite-dimensional, this procedure terminates after $n$ steps, leaving
a basis $\{v_1, \ldots, v_n\}$ of $H$ that by construction consists of eigenvectors of $a$.

• If $H$ is separable, the iteration procedure may be continued countably many
times, leading to an ordered countable set $B = (v_1, v_2, \ldots)$ of orthogonal unit
vectors that are eigenvectors of $a$. By construction we have $|\lambda_N| \geq |\lambda_{N+1}|$ for all
$N \in \mathbb{N}$, and hence there are two scenarios: either $e_N a = 0$ for all $N \geq N_0 \in \mathbb{N}$
(with $e_N a \neq 0$ if $N < N_0$), in which case $a = 0$ on $(v_1, \ldots, v_{N_0})^\perp$, or all $|\lambda_N| > 0$.

• In general, consider the set of all orthonormal sets in $H$ that consist of eigenvectors
of $a$. This set is nonempty by the argument above, and is inductively ordered
by inclusion, so by Zorn's Lemma it must have a maximal element $B$.



By Theorem B.61.5, the set $B$ is a basis of $H$ iff $B^\perp = \{0\}$. To show that this is the
case, suppose $B^\perp$ is a nonzero Hilbert space. Define $f$ as the projection on $H$ with
image $B^\perp$ and consider the self-adjoint compact operator $fa$. If $fa = 0$, there is at
least one eigenvector of $a$ in $B^\perp = fH$ (namely, with eigenvalue zero), which is a
contradiction. If $fa \neq 0$, then $a$ has an eigenvector by Lemma B.137, and again a
contradiction has been found: for in all three cases, by construction all eigenvectors
were already contained in $B^{\perp\perp} = \mathrm{span}(v_1, \ldots)^-$. □

Even if H is non-separable, the image of a compact operator a must nonetheless 
be separable. Therefore, the non-zero eigenvalues of a form a countable set, and 
the eigenvalue zero (which, by the same token, must occur in the non-separable 
case) has some uncountable multiplicity (in sharp contrast to which, each nonzero 
eigenvalue has finite multiplicity). Also in the separable case, the only eigenvalue 
that may have infinite multiplicity is zero (though in the separable case it does not 
necessarily occur). Theorem B.135 is now a consequence of the following lemma: 

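Lemma B.139. A diagonalizable operator $a$ is compact iff $\lambda \in \ell_0(I)$.

Proof. In view of the proof and subsequent comment above, we may as well assume
that $I = \mathbb{N}$. For any $\psi \in H$, the sum in (B.214) converges, so we must have
$\lim_n \langle v_n, \psi \rangle = 0$, or, in other words, $v_n \to 0$ weakly. If $a \in B_0(H)$, then $a v_n \to 0$ in
norm by Proposition B.125, and hence $\lambda_n \to 0$, i.e., $\lambda \in \ell_0(\mathbb{N})$. Conversely, if this
holds, then for each $\varepsilon > 0$, the set $I_\varepsilon = \{n \in \mathbb{N} : |\lambda_n| \geq \varepsilon\}$ is finite. This implies that
the operator $a_n = \sum_{m \in I_{1/n}} \lambda_m e_{v_m}$ has finite rank. Since $|\lambda_m| < \varepsilon$ whenever $m \notin I_\varepsilon$,

$$\|(a_n - a)\psi\|^2 = \Big\| \sum_{m \notin I_{1/n}} \lambda_m \langle v_m, \psi \rangle v_m \Big\|^2 \leq \sum_{m \notin I_{1/n}} |\lambda_m|^2 |\langle v_m, \psi \rangle|^2 \leq \frac{1}{n^2} \|\psi\|^2, \tag{B.444}$$

where in the last step we also used (B.213). Hence $a_n \to a$ in norm, so that $a$ is
compact by Proposition B.131. □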
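The finite-rank approximation used in this proof can be sketched numerically. The example below assumes the eigenvalue sequence $\lambda_n = 1/n$, cut off at 1000 terms, and uses the fact that the norm of a diagonal operator is the supremum of the moduli of its eigenvalues; the function names are illustrative.

```python
def truncation_error(eigenvalues, eps):
    """Operator-norm distance between the diagonal operator and its truncation
    to the indices with |lambda| >= eps, i.e. the largest discarded |lambda|."""
    tail = [abs(lam) for lam in eigenvalues if abs(lam) < eps]
    return max(tail, default=0.0)

def rank_of_truncation(eigenvalues, eps):
    """Number of eigenvalues kept by the truncation, i.e. the size of I_eps."""
    return sum(1 for lam in eigenvalues if abs(lam) >= eps)

# Assumed example: lambda_n = 1/n tends to zero, which is the compact case.
lambdas = [1 / n for n in range(1, 1001)]
cutoffs = (2, 10, 100)
errors = [truncation_error(lambdas, 1 / n) for n in cutoffs]
ranks = [rank_of_truncation(lambdas, 1 / n) for n in cutoffs]
```

Each truncation keeps only finitely many eigenvalues while the error drops below $1/n$, which is the content of (B.444). For a bounded sequence that does not tend to zero, such as the constant sequence 1, the error would stay at 1 for every finite truncation.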


To finish the proof of Theorem B.135, we show that the sum in (B.437), which for
general bounded diagonalizable operators converges strongly, in fact converges in
norm. To put this in perspective, eq. (B.437) with $a = 1_H$ reads

$$1_H = \sum_{i \in I} e_{v_i}. \tag{B.445}$$

If $I$ is infinite, this sum cannot converge uniformly: e.g., if we take $I = \mathbb{N}$, then

$$\lim_{N \to \infty} \Big\| 1_H - \sum_{n=1}^{N} e_{v_n} \Big\| = \lim_{N \to \infty} \sup \Big\{ \Big\| \psi - \sum_{n=1}^{N} \langle v_n, \psi \rangle v_n \Big\|,\ \psi \in H_1 \Big\} \tag{B.446}$$

cannot be zero, as shown by taking $\psi$ orthogonal to all $v_1, \ldots, v_N$. However, by
Theorem B.61.1 the sum does converge strongly (i.e., applied to each fixed $\psi$).

This seemingly special case even yields strong convergence of the sum in (B.437) 
for general diagonalizable bounded operators a, for by continuity of a we have: 
$$a\psi = a \sum_{i \in I} \langle v_i, \psi \rangle v_i = \sum_{i \in I} \langle v_i, \psi \rangle a v_i = \sum_{i \in I} \lambda_i \langle v_i, \psi \rangle v_i = \sum_{i \in I} \lambda_i e_{v_i} \psi. \tag{B.447}$$

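If $a$ is compact, strong convergence of (B.437) may be strengthened to norm convergence.
The argument is analogous to the proof of Lemma B.139, but for completeness
and contrast we now present it for general $I$. Since $\lambda \in \ell_0(I)$, for given $\varepsilon > 0$
there is a finite set $I_\varepsilon \subset I$ for which $|\lambda_i| < \varepsilon$ for all $i \notin I_\varepsilon$. For fixed $\psi \in H$, we have

$$\Big\| \Big( a - \sum_{i \in I_\varepsilon} \lambda_i e_{v_i} \Big) \psi \Big\|^2 = \Big\| \sum_{i \notin I_\varepsilon} \lambda_i \langle v_i, \psi \rangle v_i \Big\|^2 \leq \varepsilon^2 \|\psi\|^2, \tag{B.448}$$

so that $\| a - \sum_{i \in I_\varepsilon} \lambda_i e_{v_i} \| \leq \varepsilon$. By Definition B.6, eq. (B.437) holds in norm. □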

This analysis by no means contradicts Corollary B.104, including (B.327): applied
to compact operators, exactly one of the subsets $A_k \subset \sigma(a)$ contains $\sigma(a) \cap U_0$,
where $U_0$ is some neighborhood of $0 \in \sigma(a)$, so that the corresponding projection
$e_{A_k}$ is infinite-dimensional and all the other $e_{A_l}$ are finite-dimensional. Thus the sum
in (B.327) takes a rather different form from either the sum $\sum_i e_{v_i}$ in (B.445)
or the sum $\sum_\lambda e_\lambda$ in (B.434); see also the end of this section.

We now prove Theorem B.136. First, as soon as $\dim(H_\lambda) = \infty$ for some $\lambda \neq 0$,
then $\lambda \notin \ell_0(I)$. Therefore, $\dim(H_\lambda) < \infty$ by Theorem B.135. In fact, it is easy to
show directly that $\dim(\ker(a - \lambda)) < \infty$ for any $a \in B_0(H)$ and $\lambda \neq 0$: since $a$ is
bounded and hence $\ker(a - \lambda)$ is closed, the latter is a Hilbert space in its own right,
so if it were infinite-dimensional, any basis $(u_n)$ of it would have the property that
$u_n \to 0$ weakly and hence $a u_n \to 0$ in norm (cf. the proof of the above lemma). But
$a u_n = \lambda u_n$, so that $(a u_n)$ cannot converge in norm as soon as $\lambda \neq 0$.

Second, take $0 \neq \lambda \in \sigma(a)$. According to Theorem B.93, in order to prove that
$\lambda \in \sigma_p(a)$, it suffices to show that $\mathrm{ran}(a - \lambda)$ is closed. We may assume that $\lambda \neq \lambda_i$
for all $i \in I$ (for otherwise, trivially $\lambda \in \sigma_p(a)$), which implies $\ker(a - \lambda) = \{0\}$.

Let $\psi_n = (a - \lambda)\varphi_n \in \mathrm{ran}(a - \lambda)$, with $\varphi_n \neq 0$ for all $n$, and suppose $\psi_n \to \psi$. We
prove that $(\varphi_n)$ is bounded. If not, then $\|\varphi_n\| \to \infty$, but since $(\varphi'_n)$ is bounded, with
$\varphi'_n = \varphi_n/\|\varphi_n\|$, and $(\psi_n)$ converges, we have $(a - \lambda)\varphi'_n = \psi_n/\|\varphi_n\| \to 0$. Now $a$ is
compact, so $(a\varphi'_n)$ has a convergent subsequence, which together with the previous
result implies that $(\varphi'_n)$ itself must have a convergent subsequence (as $\lambda \neq 0$), say
to $\varphi'$. Continuity of $a$ gives $(a - \lambda)\varphi' = 0$, hence $\varphi' \in \ker(a - \lambda) = \{0\}$. But this
is impossible, as $\|\varphi'_n\| = 1$ for all $n$. Thus knowing that $(\varphi_n)$ is bounded, once again
using compactness of $a$, we infer that $(a\varphi_n)$ has a convergent subsequence. Now

$$\varphi_n = \lambda^{-1}(a\varphi_n - (a - \lambda)\varphi_n) = \lambda^{-1}(a\varphi_n - \psi_n), \tag{B.449}$$

and since $(\psi_n)$ converges by assumption, this implies that $(\varphi_n)$ has a convergent
subsequence, say with limit $\varphi$. Continuity of $a$ then implies that

$$\psi = (a - \lambda)\varphi \in \mathrm{ran}(a - \lambda), \tag{B.450}$$

and hence $\mathrm{ran}(a - \lambda)$ is closed. Therefore, $\lambda \in \sigma_p(a)$.
To show that $0 \in \sigma(a)$, assume that $a$ were invertible (which is to say that
$0 \in \rho(a)$). Then its inverse would be bounded, so that $a^{-1}a = 1_H \in B_0(H)$
by Theorem B.130. But this is impossible in infinite dimension; a similar argument
to the one below (B.445) shows that $1_H$ cannot possibly be approximated by finite-rank
operators. The last claim of Theorem B.136 is the same as $\lambda \in \ell_0(I)$. □

Here is a nice example of compact operators, also justifying the notation $B_0(H)$.

Corollary B.140. Let $H = \ell^2(\mathbb{N})$ and for $f \in \ell^\infty(\mathbb{N})$ define the multiplication operator
$m_f$ as usual, i.e., $m_f \psi = f\psi$. Then $m_f$ is compact iff $f \in \ell_0(\mathbb{N})$.

Proof. This follows from Theorem B.135, where the label set is $I = \mathbb{N}$, the basis
$(v_i)_{i \in I}$ is $(\delta_n)_{n \in \mathbb{N}}$, where $\delta_n(m) = \delta_{nm}$ as usual, $m \in \mathbb{N}$, and the eigenvalues are

$$\lambda_n = f(n), \tag{B.451}$$

since obviously $m_f \delta_n = f\delta_n = f(n)\delta_n$. We already know from (B.276) that $\sigma(m_f) = \mathrm{ran}(f)^-$,
which for $f \in \ell_0(\mathbb{N})$ equals $\mathrm{ran}(f)$ if $0 \in \mathrm{ran}(f)$, and

$$\mathrm{ran}(f)^- = \mathrm{ran}(f) \cup \{0\}, \tag{B.452}$$

otherwise. In the first case, $\sigma(m_f) = \sigma_p(m_f) = \mathrm{ran}(f)$, so $\sigma_c(m_f) = \emptyset$, whereas
in the second case we have $\sigma_p(m_f) = \mathrm{ran}(f)$ and $\sigma_c(m_f) = \{0\}$. This also shows
that in clause 4 of Theorem B.136, both possibilities $0 \in \sigma_p(a)$ and $0 \in \sigma_c(a)$ may
occur, depending on $a$. Finally, the condition $\lambda \in \ell_0(I)$, which in the example $a = m_f$
reduces to (B.441), is just a restatement of the condition $f \in \ell_0(\mathbb{N})$. □

In the continuous case, for $H = L^2(X)$, say for some connected open set $X \subset \mathbb{R}^n$
with Lebesgue measure, the multiplication operator $m_f$ defined by a nonzero function
$f \in C_0(X)$ is never compact, cf. (B.276); it is the very opposite of a compact operator!

To close, in our (traditional) proof of Theorem B.136 we did not use the powerful
spectral Theorem B.94. If $\dim(H) < \infty$, Theorem B.132 indeed follows from
Theorem B.94: if, for $\lambda \in \mathbb{R}$, we define $1_{\{\lambda\}} \equiv \delta_\lambda : \mathbb{R} \to \mathbb{C}$ by $\delta_\lambda(x) = \delta_{\lambda x}$, then

$$\mathrm{id}_{\sigma_p(a)} = \sum_{\lambda \in \sigma_p(a)} \lambda \cdot \delta_\lambda; \tag{B.453}$$

$$1_{\sigma_p(a)} = \sum_{\lambda \in \sigma_p(a)} \delta_\lambda. \tag{B.454}$$

Now define $e_\lambda = \delta_\lambda(a)$. Then (B.290) - (B.291) give $e_\lambda^2 = e_\lambda^* = e_\lambda$, so that $e_\lambda$ is
a projection. Furthermore, since $\mathrm{id}_{\sigma_p(a)} \cdot \delta_\lambda = \lambda \cdot \delta_\lambda$, eq. (B.290) gives $a e_\lambda = \lambda e_\lambda$,
so that $e_\lambda H \subseteq H_\lambda$. Applying the map $f \mapsto f(a)$ to (B.453) - (B.454) then yields
(B.433) - (B.434), from which the equality $e_\lambda H = H_\lambda$ follows a fortiori.

If $\dim(H) = \infty$ and $a \in B_0(H)_{\mathrm{sa}}$, this still works for each nonzero $\lambda \in \sigma_p(a)$, and
since the sum (B.453) converges uniformly in $C(\sigma(a))$, we obtain (B.433) in the
same way, including its norm-convergence. Unfortunately, even if we replace $\sigma_p(a)$
by $\sigma(a)$, as we should, eq. (B.454) now fails, even pointwise, so that (B.434) still
requires the kind of proof we gave (or a complicated argument based on (B.327)).
B.20 The trace

For finite-dimensional $H$ the trace was defined by (A.77). There are (at least) two
difficulties in generalizing this expression to the infinite-dimensional case in the
naive way. First, not every operator has a finite trace; for example, take $a = 1_H$, so
that $\mathrm{Tr}(1_H) = \dim(H)$. Second, Lemma A.25 is no longer valid in general: it is easy
to find an operator $a \in B(H)$ and bases $(v_i)$ and $(v'_i)$ of $H$ for which

$$\sum_i \langle v_i, a v_i \rangle \neq \sum_i \langle v'_i, a v'_i \rangle,$$

typically because one of these expressions converges, whereas the other diverges.
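For example, take $a = \sum_k (|v_{2k-1}\rangle\langle v_{2k}| + |v_{2k}\rangle\langle v_{2k-1}|)$ as a strong limit, i.e.,

$$a\psi = \sum_k \big( \langle v_{2k}, \psi \rangle v_{2k-1} + \langle v_{2k-1}, \psi \rangle v_{2k} \big);$$

this lies in $H$ by Theorem B.61, from which (B.214) shows that $\|a\psi\| = \|\psi\|$. Take
$v'_1 = (v_1 + v_2)/\sqrt{2}$, $v'_2 = (v_1 - v_2)/\sqrt{2}$, $v'_3 = (v_3 + v_4)/\sqrt{2}$, $v'_4 = (v_3 - v_4)/\sqrt{2}$,
etc. Then $\sum_i \langle v'_i, a v'_i \rangle = 1 - 1 + 1 - 1 + \cdots$ diverges, whereas $\sum_i \langle v_i, a v_i \rangle = \sum_i 0 = 0$.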
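The basis dependence described here can be made concrete on a finite truncation. The sketch below assumes, purely for illustration, that the operator is the pairwise swap of consecutive basis vectors (one operator with the stated behaviour) and works with the first eight basis vectors; all names are hypothetical.

```python
import math

def swap_pairs(psi):
    """Assumed example operator: exchange the coordinates 2k-1 and 2k."""
    out = list(psi)
    for k in range(0, len(psi) - 1, 2):
        out[k], out[k + 1] = psi[k + 1], psi[k]
    return out

def inner(x, y):
    """Inner product of real coordinate vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

dim = 8  # truncation size (an assumption of this sketch)
standard = [[1.0 if j == i else 0.0 for j in range(dim)] for i in range(dim)]

# Rotated basis: (v1 + v2)/sqrt(2), (v1 - v2)/sqrt(2), (v3 + v4)/sqrt(2), ...
s = 1 / math.sqrt(2)
rotated = []
for k in range(0, dim, 2):
    plus = [0.0] * dim
    minus = [0.0] * dim
    plus[k], plus[k + 1] = s, s
    minus[k], minus[k + 1] = s, -s
    rotated += [plus, minus]

diag_standard = [inner(v, swap_pairs(v)) for v in standard]  # all zero
diag_rotated = [inner(v, swap_pairs(v)) for v in rotated]    # +1, -1, +1, -1, ...
partial_sums = [sum(diag_rotated[:n]) for n in range(1, dim + 1)]
```

The partial sums in the rotated basis oscillate between 1 and 0 and therefore have no limit as the truncation grows, whereas every partial sum in the original basis vanishes.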

However, if $a \in B(H)$ is positive, i.e., $a \geq 0$ in the usual sense that $\langle \psi, a\psi \rangle \geq 0$
for each $\psi \in H$, then we will show that for any two bases $(v_i)$ and $(v'_i)$ of $H$,

$$\sum_i \langle v_i, a v_i \rangle = \sum_i \langle v'_i, a v'_i \rangle, \tag{B.455}$$

where both sides may be infinite. Equivalently, (A.79) is valid, since any unitary
operator defines and is defined by a basis transformation. To prove (B.455), we
need a very useful construction of independent interest, cf. (A.73).

Lemma B.141. Any positive operator $a \in B(H)$ has a (unique) square root, i.e., a
positive operator $\sqrt{a} \in C^*(a)$ that satisfies $(\sqrt{a})^2 = a$.

Proof. This follows from Theorem B.94, since if $a \geq 0$, then $\sigma(a) \subset \mathbb{R}^+$ and hence
$\sqrt{\cdot}$ is defined on $\sigma(a)$. Alternatively, one may use the following construction due
to the Dutch mathematician C. Visser (which is a special case of the approach just
mentioned). If necessary, first rescale $a$ so that $\|a\| \leq 1$, take the power series for

$$\sqrt{1 - x} = \sum_{k \geq 0} t_k x^k, \tag{B.456}$$

(in which $t_0 = 1$), which converges absolutely for $|x| \leq 1$, and put

$$\sqrt{a} = \sum_{k \geq 0} t_k (1_H - a)^k. \tag{B.457}$$

As in the numerical case, squaring the series and rearranging terms yields $(\sqrt{a})^2 = a$.
Since uniqueness will not be needed, we omit the proof. □
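The series construction of the square root can be tried out on a small matrix. The following is a minimal sketch, assuming a positive 2 by 2 real matrix with norm below 1 and a cutoff of 200 terms; it uses the binomial coefficients of $\sqrt{1-x}$, which obey $t_0 = 1$ and $t_{k+1} = t_k (k - \tfrac{1}{2})/(k+1)$. The function names are illustrative.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sqrt_by_series(a, terms=200):
    """Approximate sqrt(a) by the truncated series sum_k t_k (1 - a)^k."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [[identity[i][j] - a[i][j] for j in range(n)] for i in range(n)]
    result = [[0.0] * n for _ in range(n)]
    power = identity  # holds (1 - a)^k
    t = 1.0           # holds t_k, starting from t_0 = 1
    for k in range(terms):
        for i in range(n):
            for j in range(n):
                result[i][j] += t * power[i][j]
        power = matmul(power, d)
        t *= (k - 0.5) / (k + 1)  # t_{k+1} = t_k (k - 1/2) / (k + 1)
    return result

# Assumed example: a positive matrix with eigenvalues of about 0.34 and 0.76.
a = [[0.5, 0.2], [0.2, 0.6]]
root = sqrt_by_series(a)
square = matmul(root, root)
```

For this matrix the eigenvalues of $1 - a$ lie well inside the unit interval, so the truncated series converges geometrically; when an eigenvalue of $1 - a$ equals 1 the series still converges, but only slowly.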

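For $a \geq 0$, we now use (B.215) to compute

$$\sum_i \langle v_i, a v_i \rangle = \sum_i \langle \sqrt{a} v_i, \sqrt{a} v_i \rangle = \sum_{i,j} \langle \sqrt{a} v_i, v'_j \rangle \langle v'_j, \sqrt{a} v_i \rangle = \sum_{i,j} \langle \sqrt{a} v'_j, v_i \rangle \langle v_i, \sqrt{a} v'_j \rangle = \sum_j \langle v'_j, a v'_j \rangle, \tag{B.458}$$

where each term in every sum is nonnegative, so that rearrangements are valid. Let

$$B(H)^+ = \{a \in B(H) \mid a \geq 0\}. \tag{B.459}$$

In view of (B.458), we have a well-defined map

$$\mathrm{Tr} : B(H)^+ \to [0, \infty]; \tag{B.460}$$

$$\mathrm{Tr}(a) = \sum_i \langle v_i, a v_i \rangle, \tag{B.461}$$

where $(v_i)$ is an arbitrary basis of $H$, of which the result is independent by (B.455).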

To drop the restriction $a \geq 0$ in the argument of the trace, for any $a \in B(H)$ we
note that $a^*a \geq 0$, so that we may define the absolute value $|a|$ of $a$ by

$$|a| = \sqrt{a^*a}. \tag{B.462}$$

Then $|a| \geq 0$ for all $a$ by construction, and if $a \geq 0$, then $|a| = a$. Finally, we define
the set of trace-class operators in $B(H)$, later seen to be a Banach space, as

$$B_1(H) = \{a \in B(H) \mid \mathrm{Tr}(|a|) < \infty\}. \tag{B.463}$$

The trace-norm of $a \in B_1(H)$, which for now is just a formula, is given by

$$\|a\|_1 = \mathrm{Tr}(|a|). \tag{B.464}$$

Lemma B.142. 1. For any $a \in B_1(H)$ we have

$$\|a\| \leq \|a\|_1. \tag{B.465}$$

2. Any trace-class operator is compact, i.e., $B_1(H) \subset B_0(H)$.

3. For $b \in B(H)$ and $a \in B_1(H)$ one has (A.100), i.e., $|\mathrm{Tr}(ab)| \leq \|a\|_1 \|b\|$.

4. The trace-class operators $B_1(H)$ form a vector space with norm (B.464).

Part 4 will shortly be improved to $B_1(H)$ actually being a Banach space.

Let us note that Lemma A.28 and Proposition A.29 on the polar decomposition 
remain valid for infinite-dimensional Hilbert space, with essentially the same proof. 

Proof. 1. By definition of the operator norm (B.227), for any $b \in B(H)$ and every $\varepsilon > 0$
there is a unit vector $\psi \in H$ such that $\|b\|^2 \leq \|b\psi\|^2 + \varepsilon$ (proof
by contradiction). Put $b = (a^*a)^{1/4}$, and note that $\|(a^*a)^{1/4}\|^2 = \||a|\| = \|a\|$ by
(C.2) and (A.93). Completing $\psi$ to a basis $(v_i)$, and noting that

$$\sum_i \|(a^*a)^{1/4} v_i\|^2 = \sum_i \langle (a^*a)^{1/4} v_i, (a^*a)^{1/4} v_i \rangle = \sum_i \langle v_i, |a| v_i \rangle = \|a\|_1, \tag{B.466}$$

we obtain

$$\|a\| = \|(a^*a)^{1/4}\|^2 \leq \|(a^*a)^{1/4}\psi\|^2 + \varepsilon \leq \sum_i \|(a^*a)^{1/4} v_i\|^2 + \varepsilon = \|a\|_1 + \varepsilon.$$

Since this holds for all $\varepsilon > 0$, one has (B.465).

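2. Let $a \in B_1(H)$. Since $\sum_i \langle v_i, |a| v_i \rangle < \infty$, for each $\varepsilon > 0$ we can find $n$ such that
$\sum_{i>n} \langle v_i, |a| v_i \rangle < \varepsilon$. Let $e_n$ be the projection onto the linear span of $\{v_1, \ldots, v_n\}$.
Using (C.2) in the form $\|a\|^2 = \|aa^*\|$ (which is valid by (A.22)) and (B.465),

$$\|e_n^\perp |a|^{1/2}\|^2 = \|e_n^\perp |a| e_n^\perp\| \leq \|e_n^\perp |a| e_n^\perp\|_1 = \sum_i \langle v_i, e_n^\perp |a| e_n^\perp v_i \rangle = \sum_{i>n} \langle v_i, |a| v_i \rangle < \varepsilon,$$

for $|e_n^\perp |a| e_n^\perp| = e_n^\perp |a| e_n^\perp$, for if $c \geq 0$ then $b^*cb \geq 0$ for any $b, c \in B(H)$. Since
$e_n^\perp = 1 - e_n$, it follows that $e_n |a|^{1/2} \to |a|^{1/2}$ in the norm topology. Since each operator
$e_n |a|^{1/2}$ obviously has finite rank, $|a|^{1/2}$ and hence $|a|$ is compact. Finally,
$a$ has polar decomposition $a = u|a|$ and $B_0(H)$ is a two-sided ideal in $B(H)$.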

3. We just showed that $a$ is compact. By Theorem B.130, also $a^*a$ is compact, and
since it is self-adjoint, Theorem B.136 applies. This gives an expansion (A.101);
although the sum may be infinite, this is no problem, as it is norm-convergent.
Thus the computation will be analogous to the finite-dimensional case, cf. Proposition
A.30, except that we cannot use (A.78), which is valid but has not been
proved yet. Fortunately, this problem may be obviated using (A.94). It follows
from Lemma A.28 and Proposition A.29 that $(v'_i = u v_i)$ also forms an orthonormal
set, like the $v_i$ themselves, since the closed linear space spanned by the
unit vectors $v_i$ is just $(\mathrm{ran}\,|a|)^-$ and $u$ is unitary from this space onto its image
$(\mathrm{ran}\, a)^-$. Taking the trace over any basis that contains the vectors $v'_i$, we compute

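$$|\mathrm{Tr}(ab)| = |\mathrm{Tr}(u|a|u^*ub)| = \Big| \sum_i \sqrt{\mu_i}\, \langle v'_i, ub v'_i \rangle \Big| \leq \sum_i \sqrt{\mu_i}\, \|u\| \|b\| \|v'_i\|^2 = \|a\|_1 \|b\|, \tag{B.467}$$

where the $\sqrt{\mu_i}$ are the eigenvalues of $|a|$ and we used $\|a\|_1 = \sum_i \sqrt{\mu_i}$, which follows from (A.101) applied to $|a|$.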

4. Let $a, b \in B_1(H)$, and let $a + b = u|a + b|$ be the polar decomposition. Then

$$\|a + b\|_1 = \mathrm{Tr}(u^*(a + b)) = \mathrm{Tr}(u^*a) + \mathrm{Tr}(u^*b).$$

Applying (A.100) with $\|u^*\| \leq 1$, one has $\|a + b\|_1 \leq \|a\|_1 + \|b\|_1$. Hence $B_1(H)$
is a vector space and $\|\cdot\|_1$ satisfies the triangle inequality. The other axioms for
a norm are obviously satisfied. □

Proposition B.143. Let $H = \ell^2(\mathbb{N})$ (or even $\ell^2(X)$, for any countable set $X$), and for
$f \in \ell^\infty(\mathbb{N})$, define the corresponding multiplication operator $m_f$ by $m_f \psi = f\psi$, cf.
Proposition B.73. We have seen that $m_f$ is bounded, with norm (B.239). Then:

$$m_f \in B_0(H) \text{ iff } f \in \ell_0(\mathbb{N}); \tag{B.468}$$

$$m_f \in B_1(H) \text{ iff } f \in \ell^1(\mathbb{N}); \tag{B.469}$$

$$\|m_f\|_1 = \|f\|_1. \tag{B.470}$$

Here $\ell_0(\mathbb{N})$ consists of all $f : \mathbb{N} \to \mathbb{C}$ for which $\lim_{x \to \infty} f(x) = 0$.

In particular, if $\dim(H) = \infty$ we have proper inclusions

$$B_1(H) \subsetneq B_0(H) \subsetneq B(H). \tag{B.471}$$

Proof. 1. For any $a \in B(H)$ we have $a \in B_0(H)$ iff $|a| \in B_0(H)$ by the polar decomposition
(since $a = u|a|$ and $|a| = u^*a$ and $B_0(H)$ is a two-sided ideal in $B(H)$).
In the present case, we have $|m_f| = \sqrt{m_f^* m_f} = \sqrt{m_{|f|^2}} = m_{|f|}$, whence $m_f \in B_0(H)$
iff $m_{|f|} \in B_0(H)$. Since $\sigma_p(m_{|f|}) = \{|f(x)|, x \in \mathbb{N}\}$, part 4 of Theorem
B.136 applied to $a = m_{|f|}$ states that $f \in \ell_0(\mathbb{N})$.

2. This rapidly follows by computing $\mathrm{Tr}(|m_f|) = \mathrm{Tr}(m_{|f|})$ in the basis $v_x = \delta_x$,
$x \in \mathbb{N}$, where $\delta_x(y) = \delta_{xy}$, as usual. □
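The difference between the conditions (B.468) and (B.469) can be seen on two standard sequences. The sketch below assumes the examples $f(n) = 1/n$ and $f(n) = 1/n^2$, which are not taken from the text, and computes partial sums of $\mathrm{Tr}(|m_f|) = \sum_n |f(n)|$ in the basis $(\delta_n)$; the helper name `partial_trace_norm` is illustrative.

```python
def partial_trace_norm(f, n_terms):
    """Partial sum of Tr(|m_f|) = sum_n |f(n)| over n = 1, ..., n_terms."""
    return sum(abs(f(n)) for n in range(1, n_terms + 1))

def harmonic(n):
    """f(n) = 1/n: tends to zero, but is not summable."""
    return 1 / n

def inverse_square(n):
    """f(n) = 1/n^2: summable, with total pi^2/6."""
    return 1 / n ** 2

harmonic_sums = [partial_trace_norm(harmonic, 10 ** k) for k in (2, 3, 4)]
square_sums = [partial_trace_norm(inverse_square, 10 ** k) for k in (2, 3, 4)]
```

The partial sums for $1/n$ keep growing (roughly like $\log n$), so the corresponding $m_f$ is compact without being trace-class, which illustrates the first proper inclusion in (B.471). The partial sums for $1/n^2$ stay below $\pi^2/6$, so that $m_f$ is trace-class.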

Proposition B.144. The map

$$\mathrm{Tr} : B_1(H) \to \mathbb{C}; \tag{B.472}$$

$$a \mapsto \sum_i \langle v_i, a v_i \rangle, \tag{B.473}$$

where $(v_i)$ is some basis of $H$, is well defined, (obviously) linear, and independent
of the choice of basis. Furthermore, (A.78), i.e., $\mathrm{Tr}(ab) = \mathrm{Tr}(ba)$, holds.

Proof. Taking $b = 1_H$ in (A.100), we have $|\mathrm{Tr}(a)| \leq \|a\|_1 < \infty$ for $a \in B_1(H)$.
Independence of the choice of basis follows by first decomposing $a = a' + ia''$, with
$a' = \frac{1}{2}(a + a^*)$ and $a'' = -\frac{1}{2}i(a - a^*)$ self-adjoint, as usual, and subsequently using
Theorem B.132 to write $a' = a'_+ - a'_-$, with

$$a'_\pm = \pm \sum_{\lambda \in \sigma_p(a') \cap \mathbb{R}^\pm} \lambda e_\lambda, \tag{B.474}$$

and likewise for $a''$. This makes $a$ a linear combination of four positive operators,
whence the claim follows from (B.458) and the obvious linearity of (B.473).

To establish (A.78), we first note that $\mathrm{Tr}(au) = \mathrm{Tr}(ua)$ for any unitary $u$; this is
the same as (A.79), which has just been proved. The claim then follows from the
following (generally useful) lemma. □

Lemma B.145. Any $a \in B(H)$ is a linear combination of at most four unitaries.

Proof. By the previous argument, we may assume that $a^* = a$, and for convenience
we also assume that $\|a\| \leq 1$. In that case, $\|a\psi\| \leq \|\psi\|$ and hence $1 - a^2 \geq 0$, so
that $\sqrt{1 - a^2}$ is defined, cf. Lemma B.141. Defining the two operators

$$u_\pm = a \pm i\sqrt{1 - a^2}, \tag{B.475}$$

we find $u_\pm^* u_\pm = u_\pm u_\pm^* = 1_H$, making each $u_\pm$ unitary, and $a = \frac{1}{2}(u_+ + u_-)$. If $a \neq a^*$,
the number of terms at most doubles. □
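For a diagonal self-adjoint operator the decomposition (B.475) acts entry by entry, which gives a quick check. The sketch below assumes the diagonal entries 0.6, -0.8 and 0 (all of modulus at most 1); the variable names are illustrative.

```python
import math

# Diagonal entries of an assumed self-adjoint operator a with norm at most 1.
a_diag = [0.6, -0.8, 0.0]

# u_pm = a +/- i sqrt(1 - a^2), computed on each diagonal entry.
u_plus = [x + 1j * math.sqrt(1 - x * x) for x in a_diag]
u_minus = [x - 1j * math.sqrt(1 - x * x) for x in a_diag]

# A diagonal operator is unitary iff all its entries have modulus one.
moduli = [abs(z) for z in u_plus + u_minus]

# The average of u_plus and u_minus recovers a.
recombined = [(p + m) / 2 for p, m in zip(u_plus, u_minus)]
```

Every entry of `u_plus` and `u_minus` lies on the unit circle, and their average gives back the original real entries, in line with $a = \frac{1}{2}(u_+ + u_-)$.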

The deeper significance of the trace-class operators now emerges. 
Theorem B.146. For any Hilbert space $H$, we have dualities and double dualities

$$B_0(H)^* \cong B_1(H); \tag{B.476}$$

$$B_1(H)^* \cong B(H); \tag{B.477}$$

$$B_0(H)^{**} \cong B(H); \tag{B.478}$$

$$B_1(H)^{**} \cong B(H)^*, \tag{B.479}$$

where the symbol $\cong$ stands for isometric isomorphism. Explicitly:

• Any norm-continuous linear map $\omega : B_0(H) \to \mathbb{C}$ takes the form

$$\omega(b) = \mathrm{Tr}(ab), \tag{B.480}$$

for some $a \in B_1(H)$ uniquely determined by $\omega$, and vice versa, giving a bijective
correspondence between $\omega \in B_0(H)^*$ and $a \in B_1(H)$ satisfying

$$\|\omega\| = \|a\|_1. \tag{B.481}$$

This equality remains valid if $\omega$ is regarded as an element of $B(H)^*$ via (B.479)
and the isometric embedding $B_1(H) \hookrightarrow B_1(H)^{**}$ (cf. Proposition B.44).

• Any norm-continuous linear map $\chi : B_1(H) \to \mathbb{C}$ takes the form

$$\chi(a) = \mathrm{Tr}(ab), \tag{B.482}$$

for some $b \in B(H)$ uniquely determined by $\chi$, and vice versa, giving a bijective
correspondence between $\chi \in B_1(H)^*$ and $b \in B(H)$ satisfying

$$\|\chi\| = \|b\|. \tag{B.483}$$

Proof. It is clear from (A.100) that $B_1(H) \subseteq B_0(H)^*$, with $\|\omega\| \leq \|a\|_1$. For the
opposite direction, we return to the projections $e_n$ in the proof of part 2 of Lemma
B.142. Taking the trace over the basis $(v_i)$, we have

$$\|a\|_1 = \mathrm{Tr}(|a|) = \lim_n \mathrm{Tr}(e_n |a| e_n) = \lim_n \mathrm{Tr}(e_n |a|) = \lim_n \mathrm{Tr}(e_n u^* a) = \lim_n \omega(e_n u^*); \tag{B.484}$$

since $\omega(e_n u^*) \geq 0$, we have $\omega(e_n u^*) \leq \|\omega\| \|e_n u^*\| \leq \|\omega\|$, whence $\|a\|_1 \leq \|\omega\|$
(note that the limiting procedure is necessary here, since $\omega(u^*)$ would not be defined
because typically $u^*$ is not compact). This proves (B.481).

To prove (B.476), it remains to be shown that every $\omega \in B_0(H)^*$ can be represented
as (B.480). Noting that $B_0(H)$ is the norm-closure of the linear span of all
operators of the sort $b = |\psi\rangle\langle\varphi|$, where $\psi, \varphi \in H$ are unit vectors, the functional $\omega$
is determined by its values on those operators. Given $\omega$, we define $a$ by its matrix
elements $\langle \varphi, a\psi \rangle = \omega(|\psi\rangle\langle\varphi|)$. Evaluating the trace on a basis containing $\varphi$ yields
$\mathrm{Tr}(a|\psi\rangle\langle\varphi|) = \langle \varphi, a\psi \rangle$ and hence gives (B.480) on operators $b$ of the said form,
upon which the general case follows by continuity.
We now prove (B.477). As in the previous case, the inclusion $B(H) \subseteq B_1(H)^*$
is clear from (A.100), as is the inequality $\|\chi\| \leq \|b\|$. This time, the proof of the
opposite inequality uses $a = |\psi\rangle\langle\varphi|$, in which case one easily obtains

$$\| |\psi\rangle\langle\varphi| \|_1 = \|\psi\| \|\varphi\|, \tag{B.485}$$

which in the case of unit vectors equals unity. Assuming (B.482), this gives

$$|\chi(a)| = |\chi(|\psi\rangle\langle\varphi|)| = |\mathrm{Tr}(|\psi\rangle\langle\varphi| b)| = |\langle \varphi, b\psi \rangle| \leq \|\chi\| \, \| |\psi\rangle\langle\varphi| \|_1 = \|\chi\|. \tag{B.486}$$

Combined with (B.228), this gives $\|b\| \leq \|\chi\|$, and hence (B.483).

Finally, as in the previous case, given $\chi$, we find $b$ through its matrix elements
$\langle \varphi, b\psi \rangle = \chi(|\psi\rangle\langle\varphi|)$, which gives (B.482) on the special trace-class operators defined
by $a = |\psi\rangle\langle\varphi|$. Noting that the linear span of such operators is dense (in the
trace-norm) in $B_1(H)$, once again this gives the general case by continuity. □

Corollary B.147. 1. The vector space $B_1(H)$ is complete in the norm (B.464).

2. $B_1(H)$ is a two-sided ideal in $B(H)$ ($a \in B(H)$, $b \in B_1(H)$ $\Rightarrow$ $ab \in B_1(H) \ni ba$).

Proof. The first claim follows from (B.476) and the completeness of $B_0(H)^*$ (cf.
Theorem B.33 and §B.9). The second follows from (A.100) and (A.78). □

This actually reveals a subtlety in (B.471): as a normed space, $B_0(H)$ simply inherits
the norm of $B(H)$, in which it is complete. Clearly, $B_1(H)$ also inherits the norm of
$B(H)$, but that is the wrong one: firstly, $B_1(H)$ is not complete in the operator norm
(indeed, its completion is $B_0(H)$), and secondly, the operator norm is the wrong one
for the fundamental dualities stated in Theorem B.146.

The following trace-class operators occupy the center stage in quantum theory. 

Definition B.148. A density operator is a positive operator $\rho \in B_1(H)$ such that

$$\mathrm{Tr}(\rho) = 1. \tag{B.487}$$

Equivalently, $\rho$ is a density operator iff it has a norm-convergent expansion

$$\rho = \sum_{\lambda \in \sigma_p(\rho)} \lambda \cdot e_\lambda, \tag{B.488}$$

where $\sigma_p(\rho)$ is some countable subset of $\mathbb{R}^+$ with 0 as its only possible accumulation
point, the multiplicity $m_\lambda = \dim(H_\lambda)$ of each eigenvalue $\lambda > 0$ is finite, and

$$\sum_{\lambda \in \sigma_p(\rho)} \lambda \cdot m_\lambda = 1. \tag{B.489}$$

Similarly, (2.6) holds just as in finite dimension, i.e., (B.488) is equivalent to

$$\rho = \sum_i p_i |v_i\rangle\langle v_i|, \tag{B.490}$$

where $(v_i)$ is a basis of $H$, and the coefficients $(p_i)$ satisfy $p_i \geq 0$ and $\sum_i p_i = 1$.
Furthermore, the $p_i$ have 0 as their only possible accumulation point and are such
that each $p_i > 0$ occurs in the set $\{p_i\}$ at most finitely many times. Like (B.488), also
the equivalent expansion (B.490) is norm-convergent by Theorem B.136.

Definition B.149. Let $H$ be a separable Hilbert space. An operator $a \in B(H)$ is
called a Hilbert-Schmidt operator if for some (and hence any) basis $(v_i)$ of $H$,

$$\sum_i \|a v_i\|^2 < \infty. \tag{B.491}$$

We write $B_2(H)$ for the set of all Hilbert-Schmidt operators on $H$.

The argument that the sum in (B.491) is independent of the basis is based on (B.215)
and is analogous to the computation (B.458), this time even without the complication
of the square root, for we simply have $\sum_i \|a v_i\|^2 = \sum_{i,j} |\langle v'_j, a v_i \rangle|^2$, etc. For
$a \in B_2(H)$, with foresight we define the expression (where $(v_i)$ is any basis of $H$):

$$\|a\|_2 = \sqrt{\mathrm{Tr}(a^*a)} = \Big( \sum_i \|a v_i\|^2 \Big)^{1/2}. \tag{B.492}$$

Theorem B.150. Let $H$ be a separable Hilbert space.

1. For any $a \in B(H)$ we have

$$\|a\| \leq \|a\|_2 \leq \|a\|_1. \tag{B.493}$$

2. Every Hilbert-Schmidt operator is compact, and refining (B.471) one has

$$B_1(H) \subset B_2(H) \subset B_0(H). \tag{B.494}$$

3. The Hilbert-Schmidt operators $B_2(H)$ form a Hilbert space with inner product

$$\langle a, b \rangle_2 = \mathrm{Tr}(a^*b), \tag{B.495}$$

and a Banach space in the ensuing norm (A.2), which equals (B.492). Clearly,

$$B_2(H)^* \cong B_2(H). \tag{B.496}$$

4. The Banach space $B_2(H)$ is a two-sided *-ideal in $B(H)$, and if $a \in B_2(H)$ and
$b \in B(H)$ we have $\|ba\|_2 \leq \|b\| \|a\|_2$ and $\|ab\|_2 \leq \|b\| \|a\|_2$.

Proof. 1. Take any unit vector $\psi \in H$ and complete it to a basis of $H$. This gives
$\|a\psi\| \leq \|a\|_2$. Taking the supremum over all such $\psi$ gives the first inequality.
The second one will be proved in the next item.

2. With $e_n$ from the proof of Lemma B.142.2, for $a \in B_2(H)$ we define $a_n = a e_n$,
and note that because $\sum_i \|a v_i\|^2$ converges, $\|a - a_n\|_2^2 = \sum_{i=n+1}^\infty \|a v_i\|^2 \to 0$.
By the previous item, $\|a_n - a\| \to 0$. Since $a_n \in B_0(H)$, by Proposition B.131
also $a$ is compact. For the second inequality in (B.493), Theorem B.136 yields:

$$\|a\|_2^2 = \sum_i \mu_i \leq \Big( \sum_i \sqrt{\mu_i} \Big)^2 = \|a\|_1^2, \tag{B.497}$$

where the $\mu_i \geq 0$ are the eigenvalues of the positive compact operator $a^*a$; the
eigenvalues of the compact operator $|a| = \sqrt{a^*a}$ are $\sqrt{\mu_i}$.

3. We first show that $B_2(H)$ is a vector space. For any $a, b \in B(H)$ we have

$$2(a^*a + b^*b) = (a + b)^*(a + b) + (a - b)^*(a - b), \tag{B.498}$$

so that $(a + b)^*(a + b) \leq 2(a^*a + b^*b)$ and hence $\|a + b\|_2^2 \leq 2(\|a\|_2^2 + \|b\|_2^2)$.
Therefore, if $a, b \in B_2(H)$, then $a + b \in B_2(H)$. Since $\|\lambda a\|_2 = |\lambda| \|a\|_2$, it is
clear that if $a \in B_2(H)$, then $\lambda a \in B_2(H)$. Hence $B_2(H)$ is a vector space.
Furthermore, because of the identity

$$a^*b = \frac{1}{4} \sum_{k=0}^{3} i^k (b + i^k a)^*(b + i^k a), \tag{B.499}$$

the inner product (B.495) may be rewritten as

$$\langle a, b \rangle_2 = \sum_i \langle e_i, a^*b e_i \rangle = \frac{1}{4} \sum_{k=0}^{3} i^k \|b + i^k a\|_2^2, \tag{B.500}$$

which shows that if $a, b \in B_2(H)$, then $|\langle a, b \rangle_2| < \infty$. This reconfirms the fact that
the trace in (B.495) may be computed in any basis, since this is true for each term
on the right-hand side of (B.500). Sesquilinearity of (B.495) is straightforward.
To prove positive definiteness, we use part 1: if $\|a\|_2 = 0$, then $\|a\| = 0$ and hence
$a = 0$, since we already know that $\|\cdot\|$ is a norm.

Knowing that (B.495) is an inner product on $B_2(H)$, it immediately follows that
$\|\cdot\|_2$ is a norm on $B_2(H)$, since, as already noted, $\|a\|_2^2 = \langle a, a \rangle_2$.
Finally, to prove completeness, we pick a basis $(v_i)$ in $H$ and note that $B_2(H)$
is the closure of the linear span of all operators of the form $a = \sum_{i,j} c_{ij} |v_i\rangle\langle v_j|$.
This is because of the continuity of the inclusion $B_2(H) \hookrightarrow B_0(H)$ (which is true
because of part 1 and the fact that $B_0(H)$ is itself the closure of this linear span).
An easy calculation then gives

$$\|a\|_2^2 = \Big\| \sum_{i,j} c_{ij} |v_i\rangle\langle v_j| \Big\|_2^2 = \sum_{i,j} |c_{ij}|^2. \tag{B.501}$$

Hence $B_2(H)$ is isometrically isomorphic to the space of square-summable sequences
$(c_{ij})$ indexed by $\mathbb{N} \times \mathbb{N}$, which by Theorem B.9 is complete in the $\ell^2$-norm
$\|c\|_2^2 = \sum_{i,j} |c_{ij}|^2$. Hence $B_2(H)$ is complete, too.

4. From (A.78) (proved in Proposition B.144) we have $\mathrm{Tr}(a^*a) = \mathrm{Tr}(aa^*)$, so that
$a \in B_2(H)$ iff $a^* \in B_2(H)$. If $b \in B(H)$ and $a \in B_2(H)$, then $\|ba v_i\| \leq \|b\| \|a v_i\|$
and hence $\|ba\|_2 \leq \|b\| \|a\|_2$, so $ba \in B_2(H)$, and hence also $a^* \in B_2(H)$ and
$a^*b^* \in B_2(H)$. Similarly, $ab \in B_2(H)$, with $\|ab\|_2 \leq \|b\| \|a\|_2$. □
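The chain of inequalities (B.493) can be checked by hand in the smallest nontrivial case. The sketch below assumes real 2 by 2 matrices, for which the two singular values $s_1 \geq s_2$ follow in closed form from $s_1^2 + s_2^2 = \sum_{i,j} m_{ij}^2$ and $s_1 s_2 = |\det m|$; the three norms are then $s_1$, $\sqrt{s_1^2 + s_2^2}$ and $s_1 + s_2$. The function names are illustrative.

```python
import math

def singular_values(m):
    """Singular values s1 >= s2 of a real 2x2 matrix given as two rows."""
    (p, q), (r, s) = m
    frob2 = p * p + q * q + r * r + s * s  # equals s1^2 + s2^2
    det = abs(p * s - q * r)               # equals s1 * s2
    disc = math.sqrt(max(frob2 * frob2 - 4 * det * det, 0.0))
    s1 = math.sqrt((frob2 + disc) / 2)
    s2 = math.sqrt(max((frob2 - disc) / 2, 0.0))
    return s1, s2

def norms(m):
    """Return (operator norm, Hilbert-Schmidt norm, trace norm)."""
    s1, s2 = singular_values(m)
    return s1, math.sqrt(s1 * s1 + s2 * s2), s1 + s2

generic = norms([[1.0, 2.0], [3.0, 4.0]])   # all three norms differ
rank_one = norms([[2.0, 0.0], [0.0, 0.0]])  # all three norms coincide
```

For the rank-one matrix the three norms agree, which shows that neither inequality in (B.493) can be improved in general.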
B.21 Spectral theory for unbounded self-adjoint operators

Although there is hardly any distinction between bounded and unbounded self-adjoint
operators in so far as the definition and elementary properties of the spectrum
are concerned (cf. Definitions B.80 and B.85, Theorem B.91, and Theorem B.93),
extending the various versions of the spectral theorem to the unbounded case is a
highly nontrivial matter. There are many ways of accomplishing this, among which
our presentation has the virtues that firstly (in contrast to von Neumann's original
approach based on the Cayley transform) we stay within the realm of self-adjointness,
and secondly we preserve the C*-algebraic spirit of Theorem B.94. Thirdly, our
treatment is sufficiently general to cover the two main applications in quantum
mechanics (viz. the Born rule and Stone's Theorem). For those applications, setting up
a functional calculus for bounded Borel functions suffices, but in order to state even
the defining property $\mathrm{id}_{\sigma(a)} \mapsto a$ of the functional calculus also for unbounded $a$ (cf.
Theorem B.94), unbounded continuous functions will also have to be incorporated
(but we refrain from a further generalization to unbounded Borel functions).

Our approach starts from the observation that (with slight abuse of notation)

$y : \mathbb{R} \to (-1,1)$; (B.502)

$y(x) = x(1+x^2)^{-1/2}$; (B.503)

$y^{-1}(x) = x(1-x^2)^{-1/2}$, (B.504)

provides a homeomorphism $\mathbb{R} \cong (-1,1)$. This has an operatorial counterpart

$a(1_H + a^2)^{-1/2} = b$; (B.505)

$b(1_H - b^2)^{-1/2} = a$, (B.506)

where the notation for the square roots should be carefully disambiguated as

$(1_H + a^2)^{-1/2} = ((1_H + a^2)^{-1})^{1/2}$; (B.507)

$(1_H - b^2)^{-1/2} = ((1_H - b^2)^{1/2})^{-1}$. (B.508)
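
As a numerical sanity check of the scalar maps (B.503) - (B.504), the following Python sketch verifies on a few sample points that $y$ maps $\mathbb{R}$ into $(-1,1)$, that $y^{-1}\circ y$ is the identity, and that $1 - y(x)^2 = (1+x^2)^{-1}$, the numerical shadow of the operator identities used later. It is an illustration only and not part of the original text; the function names `y` and `y_inv` and the sample points are chosen here for convenience.

```python
import math


def y(x: float) -> float:
    """Scalar bounded transform x -> x (1 + x^2)^(-1/2), cf. (B.503)."""
    return x / math.sqrt(1.0 + x * x)


def y_inv(t: float) -> float:
    """Inverse map t -> t (1 - t^2)^(-1/2) on the open interval (-1, 1), cf. (B.504)."""
    if not -1.0 < t < 1.0:
        raise ValueError("y_inv is only defined on the open interval (-1, 1)")
    return t / math.sqrt(1.0 - t * t)


# Sample points are arbitrary (an assumption of this sketch).
samples = [-50.0, -3.0, -0.5, 0.0, 0.25, 2.0, 40.0]
for x in samples:
    t = y(x)
    assert -1.0 < t < 1.0                      # y lands in (-1, 1)
    assert math.isclose(y_inv(t), x, rel_tol=1e-9, abs_tol=1e-12)
    # Scalar analogue of 1 - b^2 = (1 + a^2)^(-1).
    assert math.isclose(1.0 - t * t, 1.0 / (1.0 + x * x), rel_tol=1e-9)
```

The sample points are kept moderate on purpose: in floating point, $y(x)$ rounds to exactly $\pm 1$ for very large $|x|$, which mirrors the fact that $\pm 1$ are only reached "at infinity".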


As we shall see, the operator $(1_H + a^2)^{-1}$ is bounded (and so is $1_H - b^2$, of course), so that square roots are only taken of bounded operators, in which case they are defined by Lemma B.141. As in the numerical case (B.503), the correspondence $a \leftrightarrow b$ in (B.505) - (B.506) will turn out to be bijective, mapping the class of (possibly unbounded) self-adjoint operators into the class of self-adjoint pure contractions:

Definition B.151. A pure contraction is a bounded operator $b : H \to H$ for which

$\|b\psi\| < \|\psi\|$ ($\psi \in H\setminus\{0\}$). (B.509)

If $b$ is in addition self-adjoint, this is equivalent to $\|b\| \le 1$ and $\ker(b \pm 1_H) = \{0\}$, i.e., $\pm 1 \notin \sigma_p(b)$; the argument is similar to the proof of Lemma B.137.

Eqs. (B.505) - (B.506) form a special case of a more general correspondence. 

Theorem B.152. The formal expressions

$b = a(1_H + a^*a)^{-1/2} = a((1_H + a^*a)^{-1})^{1/2}$; (B.510)

$a = b(1_H - b^*b)^{-1/2} = b((1_H - b^*b)^{1/2})^{-1}$, (B.511)

make rigorous sense and define a bijective correspondence between the class of closed operators $a$ (with dense domain) and the class of (necessarily bounded) pure contractions $b$. This correspondence preserves the adjoint, in that

$b^* = a^*(1_H + aa^*)^{-1/2}$; (B.512)

$a^* = b^*(1_H - bb^*)^{-1/2}$, (B.513)

and hence specializes to a bijective correspondence (B.505) - (B.506) between self-adjoint operators $a$ and self-adjoint pure contractions $b$.

The (bounded) operator $b$ is called the bounded transform of $a$.

Proof. 1. From $b$ to $a$. If $b$ is a pure contraction, then $1_H - b^*b \ge 0$, since this means

$\langle\psi, b^*b\psi\rangle \le \langle\psi,\psi\rangle$, (B.514)

or $\|b\psi\|^2 \le \|\psi\|^2$. Furthermore, $1_H - b^*b$ is injective, since $(1_H - b^*b)\psi = 0$ implies $\|\psi\|^2 = \|b\psi\|^2$, contradicting (B.509). This implies that $(1_H - b^*b)^{1/2}$ is injective, as $(1_H - b^*b)^{1/2}\psi = 0$ implies $(1_H - b^*b)\psi = 0$ and hence $\psi = 0$. Thus the inverse (B.508) exists, with domain

$D((1_H - b^*b)^{-1/2}) = \mathrm{ran}((1_H - b^*b)^{1/2})$. (B.515)

This domain is dense in $H$, since for any $c \in B(H)$ (which in our case is $c = (1_H - b^*b)^{1/2}$) we have $H = \ker(c) \oplus \ker(c)^\perp$; for $c^* = c$ we have $\ker(c) = \mathrm{ran}(c)^\perp$ and hence $\ker(c)^\perp = \mathrm{ran}(c)^-$, so that injectivity of $c$ yields $H = \mathrm{ran}(c)^-$. Hence (B.511) is well defined on

$D(a) = \mathrm{ran}((1_H - b^*b)^{1/2})$. (B.516)

To prove that $a$ is closed, we write $a = bc^{-1}$, as above, and note that

$G(a) = \{(c\psi, b\psi),\ \psi \in H\} = \mathrm{ran}(v)$, (B.517)

where $v : H \to H \oplus H$ is obviously defined by $v\psi = (c\psi, b\psi)$. Hence

$\|v\psi\|^2 = \|b\psi\|^2 + \|c\psi\|^2 = \|b\psi\|^2 + \langle\psi, (1_H - b^*b)\psi\rangle = \|\psi\|^2$, (B.518)

so that $v$ is an isometry. As such, $\mathrm{ran}(v) = G(a)$ is closed.

2. From $a$ to $b$. By definition, $D(1_H + a^*a) = D(a^*a)$, with

$D(a^*a) = \{\psi \in D(a) \mid a\psi \in D(a^*)\}$. (B.519)

We show that $1_H + a^*a : D(a^*a) \to H$ is bijective. First, (B.237) implies

$H \oplus H = G(a) \oplus G(a)^\perp = G(a) \oplus uG(a^*)$, (B.520)

so for any $(\psi_1, \psi_2) \in H \oplus H$ there are unique $\varphi \in D(a)$ and $\chi \in D(a^*)$ such that

$\psi_1 = \varphi - a^*\chi$; (B.521)

$\psi_2 = a\varphi + \chi$. (B.522)

In particular, for $(\psi_1, \psi_2) = (\psi, 0)$ we obtain

$\psi = (1_H + a^*a)\varphi$. (B.523)

This shows both surjectivity and injectivity, since $\varphi$ is uniquely determined by $\psi$. Consequently, the inverse

$(1_H + a^*a)^{-1} : H \to D(a^*a)$ (B.524)

exists as a linear map, and since

$\|(1_H + a^*a)^{-1}\psi\| = \|\varphi\| \le \|(1_H + a^*a)\varphi\| = \|\psi\|$, (B.525)

we see that $(1_H + a^*a)^{-1}$ is bounded, with $\|(1_H + a^*a)^{-1}\| \le 1$. A similar argument shows that $(1_H + a^*a)^{-1}$ is positive:

$\langle\psi, (1_H + a^*a)^{-1}\psi\rangle = \langle(1_H + a^*a)\varphi, \varphi\rangle = \|\varphi\|^2 + \|a\varphi\|^2 \ge 0$, (B.526)

so that the square root (B.507) exists. As before, injectivity of $(1_H + a^*a)^{-1}$ implies injectivity of its square root, whence $\mathrm{ran}((1_H + a^*a)^{-1/2})$ is dense in $H$.

Clearly, $(1_H + a^*a)^{-1/2}$ maps $\mathrm{ran}((1_H + a^*a)^{-1/2})$ to

$\mathrm{ran}((1_H + a^*a)^{-1}) = D(a^*a) \subset D(a)$, (B.527)

so that the operator $b$ in (B.510) is defined on $\mathrm{ran}((1_H + a^*a)^{-1/2})$. We now show that $b$ is bounded on the latter: for any $\psi \in H$ we have

$\|b(1_H + a^*a)^{-1/2}\psi\|^2 = \|a(1_H + a^*a)^{-1}\psi\|^2$

$= \langle(1_H + a^*a)^{-1}\psi, a^*a(1_H + a^*a)^{-1}\psi\rangle$

$\le \langle(1_H + a^*a)^{-1}\psi, (1_H + a^*a)(1_H + a^*a)^{-1}\psi\rangle$

$= \langle(1_H + a^*a)^{-1}\psi, \psi\rangle = \|(1_H + a^*a)^{-1/2}\psi\|^2$, (B.528)

so that $b$ may be extended to all of $H$ by continuity, with $\|b\| \le 1$. Still denoting this extension by $b$, we have

$b^*b = (1_H + a^*a)^{-1/2}a^*a(1_H + a^*a)^{-1/2} = 1_H - (1_H + a^*a)^{-1}$, (B.529)

from which it easily follows that $b$ is a pure contraction; for any $\psi \ne 0$, we have

$\|b\psi\|^2 = \langle\psi, b^*b\psi\rangle = \|\psi\|^2 - \|(1_H + a^*a)^{-1/2}\psi\|^2 < \|\psi\|^2$, (B.530)

since $\|(1_H + a^*a)^{-1/2}\psi\|^2 > 0$ by injectivity of $(1_H + a^*a)^{-1/2}$.

3. Bijectivity of the correspondence $a \leftrightarrow b$. If $a$ is determined by $b$ according to (B.511), then

$1_H + a^*a = (1_H - b^*b)^{-1}$, (B.531)

so that

$(1_H - b^*b)^{1/2} = (1_H + a^*a)^{-1/2}$, (B.532)

whence

$b = b(1_H - b^*b)^{-1/2}(1_H - b^*b)^{1/2} = a(1_H - b^*b)^{1/2} = a(1_H + a^*a)^{-1/2}$. (B.533)

Similarly, if $b$ is defined by $a$ according to (B.510), then (B.529), rewritten as

$1_H - b^*b = (1_H + a^*a)^{-1}$, (B.534)

reproduces (B.511). To see that the domains match, in view of (B.516) we need

$D(a) = \mathrm{ran}((1_H + a^*a)^{-1/2})$. (B.535)

The inclusion $D(a) \supseteq \mathrm{ran}((1_H + a^*a)^{-1/2})$ already having been established in step 2 above, we prove the opposite inclusion $\subseteq$. Indeed, for any $\psi \in D(a)$ we have

$\psi = (1_H + a^*a)^{-1/2}(b^*a + (1_H + a^*a)^{-1/2})\psi$, (B.536)

where $b$ is given by (B.510). This follows by taking inner products with $\varphi \in H$:

$\langle\varphi, (1_H + a^*a)^{-1/2}b^*a\psi\rangle + \langle\varphi, (1_H + a^*a)^{-1}\psi\rangle$

$= \langle a^*a(1_H + a^*a)^{-1}\varphi, \psi\rangle + \langle\varphi, (1_H + a^*a)^{-1}\psi\rangle = \langle\varphi, \psi\rangle$. (B.537)

4. Self-adjointness. Since $a$ is closed we have $a^{**} = a$ (cf. Lemma B.74), so using $a^*$ instead of $a$ in part 2 above, we have

$(1_H + aa^*)^{-1} : H \to D(aa^*) \subset D(a^*)$, (B.538)

bijectively. If, in addition, $\psi \in D(a^*)$, we may compute

$a^*\psi = a^*(1_H + aa^*)(1_H + aa^*)^{-1}\psi = (1_H + a^*a)a^*(1_H + aa^*)^{-1}\psi$, (B.539)

from which it follows that

$a^*(1_H + aa^*)^{-1}\psi = (1_H + a^*a)^{-1}a^*\psi$. (B.540)

Similarly, for any polynomial $p$ in one real variable we have

$a^*p((1_H + aa^*)^{-1})\psi = p((1_H + a^*a)^{-1})a^*\psi$. (B.541)

By Weierstrass, we can find polynomials $p_n$ such that

$\lim_{n\to\infty} p_n((1+x)^{-1}) = (1+x)^{-1/2}$ (B.542)

for any $x \ge 0$, also cf. the proof of Lemma B.141. Hence by Theorem B.94 and closedness of $a^*$ we obtain

$a^*(1_H + aa^*)^{-1/2}\psi = (1_H + a^*a)^{-1/2}a^*\psi = (a(1_H + a^*a)^{-1/2})^*\psi = b^*\psi$, (B.543)

for $\psi \in D(a^*)$. Since the latter is dense, we have (B.512). Bijectivity of the correspondence $a \leftrightarrow b$ then also implies (B.513). In particular, $a^* = a$ iff $b^* = b$, which implies the last claim of the theorem. □

Though not needed in what follows, it would be a pity not to state: 

Corollary B.153. If $a : D(a) \to H$ is closed (with $D(a)^- = H$), then:

1. $1_H + a^*a$ is self-adjoint on $D(a^*a)$;

2. $(1_H + a^*a)^{-1} = \pi_1 \circ e_{G(a)} \circ \iota_1$, where:

• $\iota_1 : H \to H \oplus H$ is defined by $\iota_1\psi = (\psi, 0)$;

• $e_{G(a)} : H \oplus H \to H \oplus H$ is the projection onto the graph $G(a)$;

• $\pi_1 : H \oplus H \to H$ is the projection $\pi_1(\psi_1, \psi_2) = \psi_1$ onto the first coordinate,

so that in total we duly have $\pi_1 \circ e_{G(a)} \circ \iota_1 : H \to H$.

3. The closure of $a|_{D(a^*a)}$ is $a$ (in other words, $D(a^*a)$ is a core for $a$).

Proof. 1. Part 2 of the proof of Theorem B.152 yields positivity and hence self-adjointness of $(1_H + a^*a)^{-1}$. The claim now follows from the (easily established) fact that the inverse of an invertible self-adjoint operator is self-adjoint, too.

2. The reasoning following (B.521) - (B.522) yields $\pi_1 e_{G(a)}\iota_1(\psi) = \varphi$, where $\varphi = (1_H + a^*a)^{-1}\psi$ by (B.523). Hence

$(1_H + a^*a)\pi_1 e_{G(a)}\iota_1 = 1_H$; (B.544)

$\pi_1 e_{G(a)}\iota_1(1_H + a^*a) = 1_H$. (B.545)

3. This is a consequence of the fact that $\mathrm{ran}(1_H + a^*a) = H$, cf. part 2 of the above proof, too. Indeed, we need to show that the graph of the restriction

$G(a|_{D(a^*a)}) = \{(\psi, a\psi),\ \psi \in D(a^*a)\}$ (B.546)

is dense in the graph $G(a) = \{(\psi, a\psi),\ \psi \in D(a)\}$ within $H \oplus H$. In other words, if $\Psi \in G(a)$ satisfies $\langle\Phi, \Psi\rangle_{H\oplus H} = 0$ for each $\Phi \in G(a|_{D(a^*a)})$, then $\Psi = 0$. With $\Psi = (\psi, a\psi)$ and $\Phi = (\varphi, a\varphi)$, where $\psi \in D(a)$ and $\varphi \in D(a^*a)$, we obtain $\langle\Phi, \Psi\rangle_{H\oplus H} = \langle(1_H + a^*a)\varphi, \psi\rangle_H$, which indeed vanishes for each $\varphi \in D(a^*a)$ iff $\psi = 0$. □
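
The formula $(1_H + a^*a)^{-1} = \pi_1 \circ e_{G(a)} \circ \iota_1$ of Corollary B.153 can be checked by hand in the simplest possible situation. The following Python sketch is a toy illustration only, under the assumption that $H = \mathbb{R}$ is one-dimensional and $a$ is multiplication by a real number (so $a$ is bounded and the corollary is elementary); the helper names are hypothetical and not taken from the text.

```python
def graph_projection(a: float, v: tuple) -> tuple:
    """Orthogonal projection of v in R^2 onto the graph {(s, a*s)} of multiplication by a."""
    g = (1.0, a)                                   # spanning vector of the graph G(a)
    coeff = (v[0] * g[0] + v[1] * g[1]) / (g[0] ** 2 + g[1] ** 2)
    return (coeff * g[0], coeff * g[1])


def inverse_via_graph(a: float, psi: float) -> float:
    """Compute pi_1(e_G(a)(iota_1(psi))) for the one-dimensional toy operator a."""
    embedded = (psi, 0.0)                          # iota_1: psi -> (psi, 0)
    projected = graph_projection(a, embedded)      # e_G(a): projection onto the graph
    return projected[0]                            # pi_1: first coordinate


# Compare with the direct formula (1 + a^2)^(-1) psi on a few arbitrary inputs.
for a in (-3.0, 0.0, 0.5, 10.0):
    for psi in (-2.0, 1.0, 7.5):
        direct = psi / (1.0 + a * a)
        assert abs(inverse_via_graph(a, psi) - direct) < 1e-12
```

In this toy case the projection of $(\psi, 0)$ onto the line spanned by $(1, a)$ is $\psi(1+a^2)^{-1}(1, a)$, whose first coordinate is exactly $(1+a^2)^{-1}\psi$, in agreement with (B.523).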

To get a feeling for the constructions to follow, we first look at the bounded case.

Proposition B.154. If $a = a^*$ is bounded and $b$ is given by (B.505), then

$C^*(a) = C^*(b)$. (B.547)

Furthermore, $\sigma(a) \subset \mathbb{R}$ and $\sigma(b) \subset (-1,1)$ (both included as compact subsets) are homeomorphic via the maps (B.503) - (B.504), preserving eigenvalues, that is,

$\sigma(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma(b)\}$; (B.548)

$\sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\}$; (B.549)

$\sigma_p(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma_p(b)\}$; (B.550)

$\sigma_p(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma_p(a)\}$. (B.551)

Proof. By Theorem B.84 and Theorem B.93, $\sigma(a) \subset \mathbb{R}$ and $\sigma(b) \subset [-1,1]$ are compact. We now show that in fact $\sigma(b) \subset (-1,1)$; in particular, $\pm 1 \notin \sigma(b)$. For if $\pm 1 \in \sigma(b)$, then $1_H \mp b$ is not invertible, so that, given that $\sqrt{1_H + a^2}$ is invertible, by (B.505) the operator $\sqrt{1_H + a^2} \mp a$ is not invertible. But since the function

$f_\pm(x) = \sqrt{1+x^2} \pm x$ (B.552)

is strictly positive on any compact subset of $\mathbb{R}$, and

$\sqrt{1_H + a^2} \pm a = f_\pm(a)$, (B.553)

the operator in question is invertible, with inverse $f_\pm(a)^{-1} = (1/f_\pm)(a)$. Contradiction. Having thus localized $\sigma(b)$, it follows that $y^{-1}$ in (B.504) is continuous on $\sigma(b)$, so that, with $a = y^{-1}(b)$, we have $a \in C^*(b)$ and hence $C^*(a) \subseteq C^*(b)$. Similarly, $b = y(a)$ and hence $C^*(b) \subseteq C^*(a)$, whence (B.547).

Eqs. (B.550) - (B.551) follow from the explicit construction of the square root in the proof of Lemma B.141; if $c\psi = \lambda\psi$, then $\sqrt{c}\,\psi = \sqrt{\lambda}\,\psi$. Likewise (more trivially), if $c$ is invertible (whence $\lambda \ne 0$), then $c^{-1}\psi = \lambda^{-1}\psi$. The same result for the full spectra follows either from the spectral mapping property (C.53), or from the following direct argument. Given (B.547), Theorem B.94 yields an isomorphism $C(\sigma(a)) \cong C(\sigma(b))$ of commutative C*-algebras, since we have

$C(\sigma(a)) \cong C^*(a) = C^*(b) \cong C(\sigma(b))$. (B.554)

Eqs. (B.548) - (B.549) then follow from the identities

$f(a) = (f \circ y^{-1})(b)$, $f \in C(\sigma(a))$; (B.555)

$g(b) = (g \circ y)(a)$, $g \in C(\sigma(b))$, (B.556)

which in turn follow from Theorem B.94. □
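
To see the spectral relation (B.549) at work in the bounded case, the following Python sketch computes $b = a(1_H + a^2)^{-1/2}$ for a real symmetric $2\times 2$ matrix and compares the eigenvalues of $b$ with $\lambda(1+\lambda^2)^{-1/2}$ for the eigenvalues $\lambda$ of $a$. This is an illustration under stated assumptions rather than anything from the original text: the matrix `a` is an arbitrary example, all helper names are hypothetical, and the square root is taken with the closed-form expression $\sqrt{m} = (m + \sqrt{\det m}\,1)/\sqrt{\mathrm{tr}\, m + 2\sqrt{\det m}}$, which is valid for positive definite $2\times 2$ matrices.

```python
import math


def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def sqrt_pos(m):
    """Positive square root of a positive definite symmetric 2x2 matrix (closed form)."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])   # sqrt(det m)
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)             # sqrt(tr m + 2 sqrt(det m))
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]


def eigenvalues_sym(m):
    """Eigenvalues of a real symmetric 2x2 matrix, in increasing order."""
    mean = 0.5 * (m[0][0] + m[1][1])
    radius = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return [mean - radius, mean + radius]


def bounded_transform(a):
    """b = a (1 + a^2)^(-1/2) for a real symmetric 2x2 matrix a, cf. (B.505)."""
    a2 = mat_mul(a, a)
    one_plus_a2 = [[1.0 + a2[0][0], a2[0][1]], [a2[1][0], 1.0 + a2[1][1]]]
    return mat_mul(a, mat_inv(sqrt_pos(one_plus_a2)))


a = [[2.0, 1.0], [1.0, -3.0]]          # arbitrary symmetric example
b = bounded_transform(a)

# y is increasing, so sorted eigenvalues of a map to sorted eigenvalues of b.
expected = [lam / math.sqrt(1.0 + lam * lam) for lam in eigenvalues_sym(a)]
for mu, nu in zip(eigenvalues_sym(b), expected):
    assert -1.0 < mu < 1.0
    assert math.isclose(mu, nu, rel_tol=1e-9, abs_tol=1e-12)
```

Since $a$ is bounded here, both eigenvalues of $b$ stay strictly inside $(-1,1)$, as the proposition asserts.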

Now suppose $a$ is unbounded. In that case, its bounded transform $b$ remains bounded, but its spectrum contains at least one of the points $\pm 1$. We abbreviate

$\tilde\sigma(b) = \sigma(b) \cap (-1,1)$. (B.557)

Proposition B.155. If $a$ and $b$ are as in Theorem B.152, their spectra are related by

$\sigma(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \tilde\sigma(b)\}$; (B.558)

$\sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\}^-$; (B.559)

$\sigma_p(a) = \{\mu(1-\mu^2)^{-1/2} \mid \mu \in \sigma_p(b)\}$; (B.560)

$\sigma_p(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma_p(a)\}$. (B.561)

If $a$ is bounded this duly reduces to (and reproves) eqs. (B.548) - (B.551), since $\sigma(b) \cap (-1,1) = \sigma(b)$, and the right-hand side of (B.559) is already closed in $\mathbb{R}$.

Lemma B.156. Let $a = a^* \in B(H)$. Then the spectrum $\sigma(a)$ according to Definition B.80 coincides with the set $\sigma(a)$ in Definition B.81, where $A = C^*(a)$.

Proof. We must show that if $(a - \lambda)^{-1}$ exists in $B(H)$, then it exists in $C^*(a)$ (in the double sense that $(a - \lambda)^{-1}$ lies in $C^*(a)$ and is the inverse of $(a - \lambda)$ in $C^*(a)$); the converse is trivial. Using Theorem B.94 as well as the obvious invariance of the spectrum (as in Definition B.81) under isomorphism, we might as well show that if $(a - \lambda)^{-1}$ exists in $B(H)$, then the function $(\mathrm{id}_{\sigma(a)} - \lambda)^{-1}$ exists in $C(\sigma(a))$. This is the case, since, by definition of $\sigma(a)$, the antecedent holds iff $\lambda \notin \sigma(a)$. □

We apply this lemma with $a \rightsquigarrow b$ in order to prove Proposition B.155.

Proof. We know from (B.516) that $\sqrt{1_H - b^2} : H \to D(a)$ is a bijection. If $\lambda \in \rho(a)$, then both maps in the following diagram are bijections:

$H \xrightarrow{\sqrt{1_H - b^2}} D(a) \xrightarrow{a - \lambda} H$, (B.562)

and this is the case iff $(a - \lambda) \circ \sqrt{1_H - b^2}$ is invertible, which, using (B.505), is true iff $b - \lambda\sqrt{1_H - b^2}$ is invertible. Hence $\lambda \in \sigma(a)$ iff $b - \lambda\sqrt{1_H - b^2} \in C^*(b)$ is not invertible in $B(H)$, or, equivalently, in $C^*(b)$. Define $g_\lambda(y) = y - \lambda\sqrt{1 - y^2}$ in $C(\sigma(b))$, so that $g_\lambda(b) = b - \lambda\sqrt{1_H - b^2}$. Theorem B.94 (again with $a \rightsquigarrow b$) then implies that $\lambda \in \sigma(a)$ iff $g_\lambda$ is not invertible in $C(\sigma(b))$, which according to (B.253) (with $\lambda = 0$) is true iff $0 \in \mathrm{ran}(g_\lambda)$. Since $g_\lambda(\pm 1) = \pm 1 \ne 0$, even if $\pm 1 \in \sigma(b)$, these values play no role, so that $0 \in \mathrm{ran}(g_\lambda)$ iff $\lambda = \mu(1 - \mu^2)^{-1/2}$ for some $\mu \in \sigma(b) \cap (-1,1)$. This yields (B.558) for $\sigma(a)$ and $\tilde\sigma(b)$.

The claimed refinement to the point spectrum follows as in the proof of Proposition B.154. The same argument shows that any $\mu \in \sigma(b) \cap (-1,1)$ must come from $\lambda \in \sigma(a)$, and since $\sigma(b)$ must be a closed subset of $[-1,1]$, this gives (B.559). □

As an illustration, take $a$ to be the position operator on $H = L^2(\mathbb{R})$, so that $b = m_f$ with $f(x) = x/\sqrt{1+x^2}$. Eq. (B.276) then gives $\sigma(a) = \mathbb{R}$ and $\sigma(b) = [-1,1]$.
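
The position-operator illustration can be made tangible numerically. The Python sketch below samples the multiplier $f(x) = x/\sqrt{1+x^2}$ on growing intervals $[-R, R]$; the sampled values fill out more and more of $(-1,1)$ and approach $\pm 1$ without reaching them, which is why $\pm 1$ enter $\sigma(b)$ only through the closure in (B.559). The cutoffs and grid size are arbitrary choices made for this sketch, and the function names are hypothetical.

```python
import math


def symbol(x: float) -> float:
    """The multiplier f(x) = x / sqrt(1 + x^2) of the bounded transform of position."""
    return x / math.sqrt(1.0 + x * x)


def sampled_range(cutoff: float, points: int = 2001) -> tuple:
    """Smallest and largest value of f on a uniform grid of [-cutoff, cutoff]."""
    step = 2.0 * cutoff / (points - 1)
    values = [symbol(-cutoff + k * step) for k in range(points)]
    return min(values), max(values)


previous_max = 0.0
for cutoff in (1.0, 10.0, 100.0, 1000.0):
    low, high = sampled_range(cutoff)
    assert -1.0 < low and high < 1.0     # the endpoints +-1 are never attained
    assert high > previous_max           # but the range creeps towards them
    previous_max = high
    print(f"R = {cutoff:7.1f}: range of f is [{low:.7f}, {high:.7f}]")
```

The cutoffs are deliberately moderate: for extremely large arguments the floating-point value of $f(x)$ rounds to exactly $\pm 1$, which is an artifact of finite precision and not a property of $f$.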

If $a$ is bounded, there are only two (commutative) C*-algebras to be concerned with in a spectral theorem à la Theorem B.94, viz. $C(\sigma(a))$ and $C^*(a)$. In the unbounded case, where $\sigma(a) \subset \mathbb{R}$ is no longer compact, already no fewer than four algebras of continuous functions are associated with the spectrum, namely (cf. §B.3):

• the set $C_c(\sigma(a))$ of all continuous functions $f : \sigma(a) \to \mathbb{C}$ with compact support;

• the set $C_0(\sigma(a))$ of all continuous functions $f : \sigma(a) \to \mathbb{C}$ that vanish at infinity;

• the set $C_b(\sigma(a))$ of all bounded continuous functions $f : \sigma(a) \to \mathbb{C}$;

• the set $C(\sigma(a))$ of all continuous functions $f : \sigma(a) \to \mathbb{C}$.

Of these, the second and the third are commutative C*-algebras in the supremum-norm; the first fails to be closed in this norm, whereas the last does not carry it (as it would be infinite on any unbounded function). We have the obvious inclusions

$C_c(\sigma(a)) \subset C_0(\sigma(a)) \subset C_b(\sigma(a)) \subset C(\sigma(a))$. (B.563)

Each of these plays a role in spectral theory (as do measurable versions of them). On the side of the bounded operator $b$, on top of $C(\sigma(b))$, we have analogous function algebras, this time with inclusions

$C_c(\tilde\sigma(b)) \subset C_0(\tilde\sigma(b)) \subset C(\sigma(b)) \subset C_b(\tilde\sigma(b)) \subset C(\tilde\sigma(b))$, (B.564)

since $C(\sigma(b))$ consists of all functions $g$ in $C_b(\tilde\sigma(b))$ for which $\lim_{y\to\pm 1} g(y)$ exists, which limit is equal to zero iff $g \in C_0(\tilde\sigma(b))$. Since $y^{-1} : (-1,1) \to \mathbb{R}$ in (B.504) restricts to a homeomorphism $\tilde\sigma(b) \to \sigma(a)$ because of (B.558), the map

$C_\bullet(\sigma(a)) \to C_\bullet(\tilde\sigma(b))$, $f \mapsto f \circ y^{-1}$, (B.565)

is an isomorphism for $\bullet = c, 0, b$, or blank (which is isometric for $0$ and $b$). If $f \in C_0(\sigma(a))$, as in (B.555) (but no longer assuming $a$ to be bounded), we may define

$f(a) = (f \circ y^{-1})(b)$, (B.566)

since $f \circ y^{-1} \in C_0(\tilde\sigma(b))$, and in view of (B.564), the right-hand side is defined by the continuous functional calculus for $b$, i.e., $g \mapsto g(b)$, where $g \in C(\sigma(b))$; the same is then true for $f \in C_c(\sigma(a))$. Let the (typically non-unital) *-algebras

$C^*_c(b) = \{g(b) \mid g \in C_c(\tilde\sigma(b))\}$; (B.567)

$C^*_0(b) = \{g(b) \mid g \in C_0(\tilde\sigma(b))\}$, (B.568)

be the pertinent images under this calculus. In view of (B.568), we then have

$C^*_c(b) \subset C^*_0(b) \subset C^*(b) \subset M(C^*_0(b)) \subset M(C^*_c(b))$, (B.569)

where $M(C^*_0(b))$ and $M(C^*_c(b))$ are the multiplier algebras of $C^*_0(b)$ and $C^*_c(b)$, respectively, cf. §C.10. Note that $M(C^*_0(b))$ is a C*-algebra contained in $B(H)$, whereas $M(C^*_c(b))$ consists (partly) of unbounded operators (see below).
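
The definition (B.566) of $f(a)$ through the bounded transform can be traced in a toy model. The Python sketch below assumes, purely for illustration, that $a$ is a diagonal operator on $\mathbb{C}^n$ (hence bounded, with its spectrum equal to the list of diagonal entries), so that the functional calculus acts entrywise; it then checks that $(f \circ y^{-1})(b)$ reproduces $f(a)$ for a Gaussian and for the resolvent function. All names are hypothetical and not taken from the text.

```python
import math


def y(x: float) -> float:
    """Scalar bounded transform, cf. (B.503)."""
    return x / math.sqrt(1.0 + x * x)


def y_inv(t: float) -> float:
    """Inverse of y on (-1, 1), cf. (B.504)."""
    return t / math.sqrt(1.0 - t * t)


def functional_calculus(f, spectrum_of_a):
    """Diagonal entries of f(a) := (f o y^{-1})(b) for a = diag(spectrum_of_a), cf. (B.566)."""
    spectrum_of_b = [y(lam) for lam in spectrum_of_a]   # b = y(a), entrywise
    return [f(y_inv(mu)) for mu in spectrum_of_b]       # g(b) with g = f o y^{-1}


spectrum_of_a = [-4.0, -1.0, 0.0, 0.5, 3.0]             # arbitrary toy spectrum

# A function vanishing at infinity: f(x) = exp(-x^2).
gauss = functional_calculus(lambda x: math.exp(-x * x), spectrum_of_a)
for lam, value in zip(spectrum_of_a, gauss):
    assert abs(value - math.exp(-lam * lam)) < 1e-12

# The resolvent function f(x) = 1 / (x - z) for a non-real z.
z = 2.0 + 1.0j
resolvent = functional_calculus(lambda x: 1.0 / (x - z), spectrum_of_a)
for lam, value in zip(spectrum_of_a, resolvent):
    assert abs(value - 1.0 / (lam - z)) < 1e-12
```

In this finite-dimensional model nothing subtle happens, of course; the point is only to display the bookkeeping $a \mapsto b = y(a) \mapsto (f\circ y^{-1})(b)$ that carries over to the unbounded case.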



Lemma B.157. The (finite) linear span $C^*_c(b)H$ of all vectors of the form $g(b)\psi$, where $g \in C_c(\tilde\sigma(b))$ and $\psi \in H$, is dense in $H$, i.e., $(C^*_c(b)H)^- = H$.

This would be trivial for $C^*(b)H$, since unlike $C^*_c(b)H$ it contains the unit $1_H$.

Proof. Approximate $1_{(-1,1)}$ pointwise by some monotone increasing bounded sequence $(f_n)$ with compact support, cf. Lemma B.97; for example, define

$f_n : (-1,1) \to \mathbb{R}$; (B.570)

$f_n(x) = 0$ ($x \in (-1, -1+1/n]$, $x \in [1-1/n, 1)$); (B.571)

$f_n(x) = 1$ ($x \in [-1+2/n, 1-2/n]$), (B.572)

and linear interpolation elsewhere. As in (B.317), we then have $f_n(b) \to 1_H$ strongly. By definition of $C^*_c(b)$, this yields the claim. □
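
The cutoff functions (B.570) - (B.572) admit the closed form $f_n(x) = \min(1, \max(0, n(1-|x|) - 1))$, which is what the following Python sketch implements; the closed form and the name `f_n` are conveniences introduced here, not notation from the text. The sketch checks the defining values, the monotonicity $f_n \le f_{n+1}$, and pointwise convergence to $1$ on a few sample points of $(-1,1)$.

```python
def f_n(n: int, x: float) -> float:
    """Piecewise-linear cutoff on (-1, 1), cf. (B.570) - (B.572).

    Equals 0 within distance 1/n of the endpoints, 1 at distance at least 2/n
    from them, and interpolates linearly in between.
    """
    if not -1.0 < x < 1.0:
        raise ValueError("f_n is only defined on the open interval (-1, 1)")
    return min(1.0, max(0.0, n * (1.0 - abs(x)) - 1.0))


samples = [-0.999, -0.9, -0.5, 0.0, 0.3, 0.75, 0.99]

# Values prescribed by (B.571) and (B.572), checked for n = 4.
assert f_n(4, 0.75) == 0.0      # x = 1 - 1/n
assert f_n(4, 0.5) == 1.0       # x = 1 - 2/n
assert f_n(4, 0.625) == 0.5     # midpoint of the linear ramp

# Monotone increasing in n at every sample point.
for x in samples:
    for n in range(1, 50):
        assert f_n(n, x) <= f_n(n + 1, x)

# Pointwise convergence to 1: each fixed x is eventually in the plateau.
for x in samples:
    assert f_n(5000, x) == 1.0
```

Each $f_n$ has compact support inside $(-1,1)$, which is what places $f_n(b)$ in $C^*_c(b)$.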

Theorem B.158. Let $a$ be a (possibly unbounded) self-adjoint operator on $H$.

1. For any $f \in C_b(\sigma(a))$, the operator $f(a)_0$, initially defined by linear extension of

$f(a)_0 h(a)\psi = (fh)(a)\psi = ((fh) \circ y^{-1})(b)\psi$, (B.573)

i.e., defined on the domain $C^*_0(b)H$ (cf. (B.565) with $\bullet = 0$), is bounded, with

$\|f(a)\| \le \|f\|_\infty$, (B.574)

and hence extends from $C^*_0(b)H$ to all of $H$ by continuity; we write

$f(a) = f(a)_0^-$. (B.575)

2. The functional calculus $f \mapsto f(a)$ from $C_b(\sigma(a))$ to $B(H)$ thus established satisfies the algebraic rules (B.289) - (B.291), and one has the reassuring cases

$1_{\sigma(a)}(a) = 1_H$; (B.576)

$\left(x \mapsto \frac{1}{x - z}\right)(a) = (a - z)^{-1}$ ($z \in \rho(a)$). (B.577)

Conceptually, what is going on here is that the homomorphism

$C_0(\sigma(a)) \to B(H)$; (B.578)

$f \mapsto f(a)$, (B.579)

as defined in (B.566), is extended to the multiplier algebra

$M(C_0(\sigma(a))) = C_b(\sigma(a))$. (B.580)

Theorem C.77 then applies, since by Lemma B.157 the initial homomorphism is nondegenerate, immediately yielding boundedness of $f(a)$. Below we will also give an independent proof of (B.574).



Proof. The operator $f(a)_0$ is densely defined by Lemma B.157 (which a fortiori implies that $C^*_0(b)H$ is dense in $H$). To prove that $f(a)_0$ is bounded, take $\varepsilon > 0$ and hence find a compact subset $K \subset \mathbb{R}$ such that $|f(x)h(x)| < \varepsilon$ whenever $x \notin K$. Writing $\tilde f = f \circ y^{-1}$ etc., using (B.322) with $f \rightsquigarrow \tilde 1_{K^c}\tilde f\tilde h$ we obtain

$\|(\tilde 1_{K^c}\tilde f\tilde h)(b)\psi\| \le \|(\tilde 1_{K^c}\tilde f\tilde h)(b)\|\,\|\psi\| \le \|\tilde 1_{K^c}\tilde f\tilde h\|_\infty\|\psi\| \le \varepsilon\|\psi\|$. (B.581)

From this, using also the homomorphism property in Theorem B.102, we then find

$\|(fh)(a)\psi\| = \|(\tilde f\tilde h)(b)\psi\|$

$= \|(\tilde 1_K\tilde f\tilde h)(b)\psi + (\tilde f\tilde h - \tilde 1_K\tilde f\tilde h)(b)\psi\|$

$\le \|(\tilde 1_K\tilde f\tilde h)(b)\psi\| + \|(\tilde 1_{K^c}\tilde f\tilde h)(b)\psi\|$

$= \|(\tilde 1_K\tilde f)(b)\tilde h(b)\psi\| + \|(\tilde 1_{K^c}\tilde f\tilde h)(b)\psi\|$

$\le \|\tilde 1_K\tilde f\|_\infty\|h(a)\psi\| + \varepsilon\|\psi\|$

$\le \|f\|_\infty\|h(a)\psi\| + \varepsilon\|\psi\|$, (B.582)

since

$\|\tilde 1_K\tilde f\|_\infty \le \|\tilde f\|_\infty = \|f\|_\infty$. (B.583)

Since the last expression in (B.582) is independent of $K$, we may let $\varepsilon \to 0$, obtaining boundedness of $f(a)$ as well as (B.574).

The second claim should be obvious from (B.566) and Theorem B.94.

Eq. (B.576) is trivial. To prove (B.577), write $f(x) = (x - z)^{-1}$, where $z \in \rho(a)$ is fixed and $x \in \sigma(a)$. We have

$f(a)_0 h(a)\psi = (fh)(a)\psi = (a - z)^{-1}h(a)\psi$, (B.584)

and hence

$f(a)_0\varphi = (a - z)^{-1}\varphi$, (B.585)

for any $\varphi \in D(f(a)_0) = C^*_0(b)H$. So if $\varphi_n \to \varphi$ for $\varphi \in H$ and $\varphi_n \in D(f(a)_0)$, boundedness and hence continuity of the operator $(a - z)^{-1}$ implies

$f(a)\varphi = \lim_{n\to\infty} f(a)_0\varphi_n = \lim_{n\to\infty}(a - z)^{-1}\varphi_n = (a - z)^{-1}\varphi$.

To construct a (typically unbounded) operator $f(a)$ for $f \in C(\sigma(a))$ in this fashion (think of $a$ itself, corresponding to $f = \mathrm{id}_{\sigma(a)}$), we first define

$D(f(a)_0) = C^*_c(b)H = \mathrm{span}\{h(a)\psi \mid h \in C_c(\sigma(a)),\ \psi \in H\}$, (B.586)

and an operator $f(a)_0 : D(f(a)_0) \to H$ may once again be defined by (B.573); once again, the whole point is that although $f$ may well be unbounded, $h$ and hence $fh$ lie in $C_c(\sigma(a))$, so that $(fh)(a)$ is defined by (B.566), and hence eventually by the continuous functional calculus for the bounded self-adjoint operator $b$.



As in the remark following Theorem B.158, from the point of view of multiplier algebras, eq. (B.573) extends the (nondegenerate) homomorphism $C_c(\sigma(a)) \to B(H)$ to the algebra $C(\sigma(a))$ of unbounded multipliers on $C_c(\sigma(a))$.

This is not the end of the construction, since $f(a)_0$ is typically not closed on the domain (B.586). However, it is a very near miss, since $f(a)_0$ is closable, cf. §B.13. To prove that the operator $f(a)_0$ in (B.573) is closable, we use the second criterion in Lemma B.74. For $g, h \in C_c(\sigma(a))$ and $\psi, \varphi \in H$ we may compute:

$\langle g(a)\varphi, f(a)_0 h(a)\psi\rangle = \langle\varphi, g(a)^* f(a)_0 h(a)\psi\rangle = \langle\varphi, (g^*fh)(a)\psi\rangle$; (B.587)

$\langle(gf^*)(a)\varphi, h(a)\psi\rangle = \langle\varphi, (gf^*)(a)^* h(a)\psi\rangle = \langle\varphi, (g^*fh)(a)\psi\rangle$. (B.588)

Hence $D(f(a)_0^*)$ must contain $D(f(a)_0)$, and on the latter we may put

$f(a)_0^*\, g(a)\varphi = (gf^*)(a)\varphi$, (B.589)

as in (B.573). In particular, $D(f(a)_0^*)$ is dense in $H$, so that $f(a)_0$ is closable. Furthermore, if $f^* = f$, then $f(a)_0$ is symmetric, i.e., $f(a)_0 \subset f(a)_0^*$. Hence the closure

$f(a) = f(a)_0^- : D(f(a)) \to H$, (B.590)

is the operator we are looking for, where $D(f(a))$ consists of all $\psi \in H$ for which there exists a sequence $(\psi_n)$ in $D(f(a)_0)$ such that $\psi_n \to \psi$ and $(f(a)_0\psi_n)$ converges, upon which Lemma B.74 gives

$f(a)\psi = \lim_n f(a)_0\psi_n$. (B.591)

What’s more, if $f^* = f$, then $f(a)_0$ is essentially self-adjoint, i.e.,

$f(a)_0^- = f(a)_0^*$, (B.592)

which (by taking the adjoint) is equivalent to the property we will actually prove:

$f(a)^* = f(a)$. (B.593)

Theorem B.159. For real-valued $f \in C(\sigma(a))$, the operator $f(a)$ is self-adjoint.

The proof of self-adjointness relies on Nelson’s Lemma:

Lemma B.160. Let $c \subset c^*$ be densely defined and symmetric. Then $c$ is essentially self-adjoint if there exists a continuous unitary representation $t \mapsto u_t$ of $\mathbb{R}$ on $H$ such that $u_t : D(c) \to D(c)$ for each $t \in \mathbb{R}$, and

$\frac{d u_t}{dt}\psi = \lim_{s\to 0}\frac{u_{t+s} - u_t}{s}\psi = i c u_t\psi$, $\psi \in D(c)$. (B.594)

This lemma is closely related to Stone’s Theorem; see Theorem 5.73 in §5.12.

Proof. The proof of Nelson’s lemma relies on the following variation of Lemma 5.74 in §5.12, proved by applying the latter (or rather its proof) to the closure of $a$:

Lemma B.161. Let $a$ be symmetric. Then $a$ is essentially self-adjoint ($a^{**} = a^*$) iff

$\mathrm{ran}(a + i)^- = \mathrm{ran}(a - i)^- = H$. (B.595)

Applying Lemma B.161 in the same way as Lemma 5.74 is used in the proof of self-adjointness of the generator $a$ in Theorem 5.73, yields Lemma B.160.

For Theorem B.159, with $c = f(a)_0$ for some $f \in C(\sigma(a), \mathbb{R})$, informally define

$u_t = \exp(itf(a))$, (B.596)

and formally define $u_t$ as the closure of the bounded operator

$(u_t)_0 = e_t(a)_0$ (B.597)

defined by the bounded function $e_t(x) = \exp(itf(x))$ on $\sigma(a)$, cf. (B.573). The verification that $t \mapsto u_t$ defines a continuous one-parameter group of unitary operators on $H$ is practically the same as in our proof of part 1 of Stone’s Theorem, and the proof of (B.594) is almost the same as a similar step in the proof of part 3 of that theorem, so we will not repeat these here. Therefore, Lemma B.160 applies, showing that $f(a)_0$ is essentially self-adjoint. □

As an important special case of our continuous functional calculus, we have

$\mathrm{id}_{\sigma(a)}(a) = a$, (B.598)

just as in the bounded case. Writing $a_0$ for the operator $(\mathrm{id}_{\sigma(a)})(a)_0$, eq. (B.573) gives $a_0\varphi = a\varphi$ for $\varphi \in D(a_0)$, cf. (B.586). Let $\psi \in D(a_0^-)$, so that there is a sequence $(\psi_n)$ in $D(a_0)$ such that $\psi_n \to \psi$ and $(a_0\psi_n)$ converges. Since $a$ is closed, it follows that $a_0\psi_n = a\psi_n \to a\psi$, so that $\psi \in D(a)$. Hence $a_0^- \subset a$. Since both operators are self-adjoint, this implies $a_0^- = a$, which proves (B.598). The proof of (B.577) is similar but easier, since $(a - z)^{-1}$ is bounded.

In similar vein, we may set up a functional calculus for bounded Borel functions of $a$. If $f \in \mathcal{B}(\sigma(a))$, then $f \circ y^{-1} \in \mathcal{B}(\tilde\sigma(b))$, so that $(f \circ y^{-1})(b)$ is defined, cf. Theorem B.102, and we may define $f(a)$ by (B.566). As in the continuous case, this map $f \mapsto f(a)$ yields a homomorphism $\mathcal{B}(\sigma(a)) \to B(H)$, satisfying (B.322).

What is still missing, however, is the von Neumann algebra $W^*(a)$ in which this homomorphism takes values. To close this section, we solve this issue.

If $c \in B(H)$ and $a$ is possibly unbounded, we say that (by convention):

$[a, c] = 0$ iff $ca \subset ac$, (B.599)

that is, if $c\,D(a) \subset D(a)$ and $ca\psi = ac\psi$ for each $\psi \in D(a)$. We write $\{a\}'$ for the set of all $c \in B(H)$ that commute with $a$. If $a^* = a$, looking at the graph of $a$ (and using the fact that $a$ is closed), it is easy to see that $\{a\}'$ is a strongly closed unital *-subalgebra in $B(H)$. Therefore, by the bicommutant theorem, $\{a\}'$ is a von Neumann algebra. Its commutant, defined in the usual sense (B.318), is denoted by $W^*(a)$, i.e.,

$W^*(a) = \{a\}''$. (B.600)

Theorem B.162. Let $a$ be a (possibly unbounded) self-adjoint operator on $H$. Then

$W^*(a) = W^*(b)$, (B.601)

where $b$ is the bounded transform (B.510) of $a$. Consequently, if $f \in \mathcal{B}(\sigma(a))$ and the operator $f(a)$ is defined by (B.566) and Theorem B.102, then $f(a) \in W^*(b)$.

Proof. We will prove a more general result of independent interest.

Definition B.163. A closed unbounded operator $a : D(a) \to H$ is affiliated to a von Neumann algebra $A \subset B(H)$, written $a\,\eta\,A$, iff $[a, c] = 0$ for each $c \in A'$.

For example, if $a^* = a$, then $a\,\eta\,W^*(a)$, and if $a\,\eta\,B$ for some $B = B''$, then $W^*(a) \subset B$.

Proposition B.164. Let $A \subset B(H)$ be a von Neumann algebra and assume $a$ is a self-adjoint operator on $H$ with bounded transform $b$. Then $a\,\eta\,A$ iff $b \in A$.

Proof. The first step consists in the observation that $a\,\eta\,A$ iff $[a, u] = 0$ (or, equivalently, $uau^* = a$) merely for each unitary $u \in A'$. To see this, we strengthen Lemma B.145 (in which we replace $a$ by $c$): if $c \in A'$, then $c$ is a linear combination of at most four unitaries in $A'$. Indeed, the unitaries $u_\pm$ in the proof are constructed via the continuous functional calculus of Theorem B.94, and hence they lie in $C^*(c) \subset A'$.

The second step is to show that $[a, u] = 0$ iff $[b, u] = 0$ for any unitary $u$. This is a simple computation; if $uau^* = a$, then, looking at the domains in question,

$u(1_H + a^2)^{-1}u^* = (1_H + a^2)^{-1}$; (B.602)

$u((1_H + a^2)^{-1})^{1/2}u^* = ((1_H + a^2)^{-1})^{1/2}$, (B.603)

from which $ubu^* = b$ with $b$ defined by (B.510). Similarly, if $bu = ub$, then $uau^* = a$, where $a$ is defined by (B.511). Theorem B.152 therefore yields the claim. □

Theorem B.162 now follows: taking $A = W^*(a)$, so that $a\,\eta\,A$, yields $b \in W^*(a)$, and hence $W^*(b) \subseteq W^*(a)$. On the other hand, taking $A = W^*(b)$, in which case $b \in A$, gives $a\,\eta\,W^*(b)$, and hence $W^*(a) \subseteq W^*(b)$. This yields (B.601), from which the final claim follows by our definition (B.566) and Theorem B.102. □

Using this language, it can be shown that for possibly unbounded Borel functions $f$ on $\sigma(a)$, the possibly unbounded operator $f(a)$ is affiliated to $W^*(a)$. Furthermore, there exists a Borel measure $\mu$ on $\sigma(a)$ such that the map $f \mapsto f(a)$ may also be seen as a so-called essential homomorphism from $\mathcal{B}(\sigma(a))/\mathcal{N}(\sigma(a))$ into the *-algebra of normal operators affiliated with $W^*(a)$, where $\mathcal{N}(\sigma(a))$ is the set of $\mu$-null functions on $\sigma(a)$; this means that the algebraic properties hold after closure.

Notes

The history of functional analysis is described from various points of view by Bernkopf (1966, 1967), Birkhoff & Kreyszig (1984), Brezis & Browder (1998), Dieudonné (1981), Monna (1973), Pier (2001), Pietsch (2007), Siegmund-Schultze (2003), and Steen (1973). Apart from von Neumann (1932), the other founding books of functional analysis (coincidentally from the same year, which closed the foundational era that began around 1900) were Banach (1932) and Stone (1932).

The concept of a Hilbert space eventually emerged from Hilbert’s work on quadratic forms in infinitely many variables (see especially his fourth paper on the subject, Hilbert, 1906), which in turn was inspired by his analysis of integral equations (Hilbert, 1912). From a modern point of view, Hilbert’s space was the unit ball in $\ell^2(\mathbb{N})$; he did not adopt the perspective of linear spaces and operators.

An important step towards this perspective was what is now called the Riesz-Fischer Theorem from 1907; Riesz (1907a) proved the isomorphism

$L^2([a,b]) \cong \ell^2(\mathbb{N})$, (B.604)

whereas Fischer (1907) proved the completeness of $L^2([a,b])$ and obtained Riesz’s isomorphism as a corollary. Riesz (1907b) also obtained the Riesz-Fréchet Theorem for the special case $L^2([a,b])$, independently found also by Fréchet (1907). In fact, Hilbert (1906) had already shown this (mutatis mutandis) for what we now call $\ell^2(\mathbb{N})$; the general case had to wait for Riesz (1934) and Löwig (1934). The latter was the first to study non-separable Hilbert spaces, including Corollary B.64. Both Riesz and Fréchet in addition played major roles in establishing another famous duality theorem, namely the one on the representation of linear functionals on continuous functions by measures (cf. Theorem B.15 etc.); see Gray (1984).

Subsequently, Schmidt (1908) developed the linear and geometric structure of $\ell^2$, arguably the first Hilbert space studied as such, and Riesz (1913) explicitly studied linear operators on this space. Finally, it was von Neumann (1927ab, 1932) who first introduced Hilbert space and operator theory from an abstract point of view, i.e., axiomatically. For a historical analysis of this step, which was triggered by the attempts of von Neumann (originally jointly with Hilbert and Nordheim) to provide a mathematical foundation for quantum mechanics, see Rédei (2005) and Duncan & Janssen (2013); also cf. Corry (2004) on the role of Hilbert himself.

Functional analysis textbooks perused by the author include Conway (2007), Dudley (1989), Kadison & Ringrose (1983), Maurin (1972), Reed & Simon (1972), Rudin (1973), Schmüdgen (2012), and Weidmann (2000). A good place to start for contemporary beginners is Rynne & Youngson (2008), followed by the more advanced text by MacCluer (2009), which also introduces C*-algebras. A natural next step would then be Pedersen (1989), and on to operator algebras!

Since most of the material in this appendix is standard except for the last three sections, it seems pointless to give detailed notes and attributions (so that several sections even lack notes), except for a few comments on unusual cases, and some supplementary material which would have distracted too much from the main text.

§B.2. $\ell^p$ spaces

Hölder’s Inequality (which incorporates the claim $fg \in \ell^1$) should be clear for $p = 1$ or $p = \infty$. For $1 < p < \infty$, we use the fact that for any $s, t \in [0, \infty)$, one has

$s^{1/p}t^{1/q} \le \frac{s}{p} + \frac{t}{q}$. (B.605)

Using (B.605) with $s = (|f(x)|/\|f\|_p)^p$ and $t = (|g(x)|/\|g\|_q)^q$ and summing over $x$ gives (B.15). To derive Minkowski’s Inequality for $1 < p < \infty$ (the cases $p = 1$ and $p = \infty$ are obvious), define

$h(x) = |f(x) + g(x)|^{p-1}$. (B.606)

Arguing as in part 1 above, if $f \in \ell^p$ and $g \in \ell^p$, then $f + g \in \ell^p$ and hence $h \in \ell^q$, since $h(x)^q = |h(x)|^q = |f(x) + g(x)|^p$. Now compute

$\|f + g\|_p^p = \sum_x |f(x) + g(x)|^p = \sum_x h(x)|f(x) + g(x)|$

$\le \sum_x |h(x)f(x)| + \sum_x |h(x)g(x)| = \|hf\|_1 + \|hg\|_1$

$\le \|h\|_q(\|f\|_p + \|g\|_p) = \|f + g\|_p^{p-1}(\|f\|_p + \|g\|_p)$, (B.607)

where in the last inequality we have used (B.15). This immediately gives (B.14).
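
The three inequalities in this note are easy to probe numerically on finite sequences. The following Python sketch checks the elementary inequality (B.605), Hölder’s Inequality, and Minkowski’s Inequality for one arbitrarily chosen pair of sequences and exponent $p = 3$; the data, the tolerance, and the helper names are assumptions of this sketch, and a finite check of course proves nothing beyond these samples.

```python
def p_norm(seq, p: float) -> float:
    """The l^p norm of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)


def young_gap(s: float, t: float, p: float) -> float:
    """s/p + t/q - s^(1/p) t^(1/q) with 1/p + 1/q = 1; nonnegative by (B.605)."""
    q = p / (p - 1.0)
    return s / p + t / q - s ** (1.0 / p) * t ** (1.0 / q)


f = [1.0, -2.0, 0.5, 3.0]        # arbitrary sample data
g = [0.25, 4.0, -1.0, 2.0]
p = 3.0
q = p / (p - 1.0)                # conjugate exponent, here 3/2
tol = 1e-12                      # slack for floating-point rounding

# The elementary inequality (B.605) on a small grid of nonnegative s, t.
for s in (0.0, 0.3, 1.0, 7.0):
    for t in (0.0, 0.5, 2.0, 9.0):
        assert young_gap(s, t, p) >= -tol

# Hoelder: ||fg||_1 <= ||f||_p ||g||_q.
fg_norm_1 = sum(abs(u * v) for u, v in zip(f, g))
assert fg_norm_1 <= p_norm(f, p) * p_norm(g, q) + tol

# Minkowski: ||f + g||_p <= ||f||_p + ||g||_p.
f_plus_g = [u + v for u, v in zip(f, g)]
assert p_norm(f_plus_g, p) <= p_norm(f, p) + p_norm(g, p) + tol
```

Equality in (B.605) occurs for $s = t$, which can be seen by evaluating `young_gap(s, s, p)` for any $s \ge 0$.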

§B.4. Basic measure theory

Standard textbooks on measure theory include Bogachev (2006), Dudley (1989), Malliavin (1995), Rudin (1986), etc.

§B.5. Measure theory on locally compact Hausdorff spaces 

Urysohn’s Lemma states that if $X$ is a locally compact Hausdorff space and $K \subset U \subset X$ with $K$ compact and $U$ open, then there is a function $g \in C_c(U)$ such that $0 \le g(x) \le 1$ for each $x \in X$ and $g(x) = 1$ for $x \in K$. Similarly, since a locally compact Hausdorff space is completely regular, for each closed set $F \subset X$ and point $x \notin F$ there is a continuous function $f$ such that $f(x) = 1$ and $f|_F = 0$.

An example of a space that is locally compact Hausdorff but not $\sigma$-compact, given by Rudin (1986), is $X = \mathbb{R}^2$ with topology given by the strange metric $d((x,y),(x',y')) = 1 + |y - y'|$ if $x \ne x'$ and $d((x,y),(x,y')) = |y - y'|$.

For a (tedious) direct proof of Theorem B.19, see Rudin (1986), Thm. 2.14. Al¬ 
ternatively, Theorem B.19 may be derived from Choquet theory, as mentioned in the 
main text, or from the Daniell-Stone construction of measures from positive func¬ 
tionals in a more general setting, see e.g. Bogachev, 2007, §7.8 or Dudley, 1989, 
§4.5. For a proof of Theorem B.22 see Malliavin (1995), Thm. 5.3.8. 

The theory of finitely additive measures is exhaustively discussed in Rao & Rao (1983); for a summary see Luxemburg (1991). The notion of a semiring of subsets of $X$ goes back to von Neumann (1950). See also Loya (2008), including a detailed proof that $\mathrm{Step}(X, \mathcal{S})$ is a (commutative) algebra.

§B.6. $L^p$ spaces

A nice result “taming” $L^p(X, \Sigma, \mu)$ is Lusin’s Theorem, assuming $\mu$ is regular:

Theorem B.165. Let $1 \le p < \infty$. If the support of $f \in L^p(X)$ has finite measure, then for any $\varepsilon > 0$ there exists $g \in C_c(X)$ such that $\mu(\{x \in X \mid f(x) \ne g(x)\}) < \varepsilon$.

§B.7. Morphisms and isomorphisms of Banach spaces 

The Baire Category Theorem states that a complete metric space cannot be a countable union of nowhere dense sets (where a set in a topological space is called nowhere dense if its closure has empty interior, i.e., does not contain a non-empty open set). In other words, if $(M, d)$ is complete and $M = \bigcup_n M_n$ with each $M_n$ closed, then there is at least one $n \in \mathbb{N}$ for which $M_n$ contains an open ball.

§B.9. Duality 

The idea of writing (B.136) as $\lim_U f$ has the following origin.

1. Let $f : X \to K$ be any function between any pair of sets, and let $F$ be a filter on
$X$. Then $f_*F$, which consists of all $B \subset K$ for which $f^{-1}(B) \in F$, is a filter on
$K$, called the push-forward of $F$ by $f$. Moreover, if $U$ is an ultrafilter on $X$, then
$f_*U$ is an ultrafilter on $K$. This gives a map

$$f_* : \mathrm{Ultra}(X) \to \mathrm{Ultra}(K). \tag{B.608}$$

If we equip $\mathrm{Ultra}(X)$ with the topology generated by all sets of the form

$$U_A = \{U \in \beta X \mid A \in U\}, \tag{B.609}$$

where $A \subset X$, as in the main text, and likewise $\mathrm{Ultra}(K)$, then $f_*$ is continuous. If
$X$ is discrete, then $\mathrm{Ultra}(X) = \beta X$, but not otherwise.

2. We say that some filter $F$ on a topological space $X$ converges to $x \in X$ if $N_x \subset F$,
where $N_x$ is the neighbourhood filter of $x$, consisting of all neighbourhoods of $x$.
This is denoted by $\lim F = x$.

3. Combining points 1 and 2, if $\lim f_*F = z$, i.e., if $N_z \subset f_*F$, we write

$$\lim_F f = z \quad (z \in K). \tag{B.610}$$

4. As for sequences, it can be shown that filters on Hausdorff spaces have at most
one limit, and that ultrafilters on compact spaces have at least one limit.
Consequently, ultrafilters on compact Hausdorff spaces $K$ have exactly one limit, i.e.,
converge to a unique point. This gives a continuous map

$$\lim : \mathrm{Ultra}(K) \to K. \tag{B.611}$$

5. It follows that if $X$ is any set (seen as a discrete topological space), $K$ is a compact
Hausdorff space, $f : X \to K$ is some function, and $U$ is an ultrafilter on $X$, then
$f_*U$ has a unique limit $z \in K$, written $\lim_U f = z$ or $\lim f_*U = z$, or $\beta f(U) = z$,
since the latter notation gives the extension $\beta f$ in the diagram (B.135). Thus
$\beta f = \lim \circ f_*$, as in the diagram that combines (B.608) and (B.611), viz.

$$\beta X = \mathrm{Ultra}(X) \xrightarrow{\;f_*\;} \mathrm{Ultra}(K) \xrightarrow{\;\lim\;} K. \tag{B.612}$$
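
The push-forward construction in point 1 can be tried out on finite sets, where every ultrafilter is principal (generated by a single point). The following is only a minimal sketch under that finiteness assumption; the helper names `powerset`, `principal_ultrafilter` and `pushforward`, and the sample sets `X` and `K`, are hypothetical and chosen for illustration. It suggests that the push-forward of the ultrafilter generated by $x$ is the ultrafilter generated by $f(x)$.

```python
from itertools import combinations


def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def principal_ultrafilter(X, x):
    """The ultrafilter on X consisting of all subsets that contain x."""
    return {A for A in powerset(X) if x in A}


def pushforward(f, X, K, F):
    """f_*F: all B in K whose preimage under f belongs to the filter F."""
    return {B for B in powerset(K)
            if frozenset(x for x in X if f(x) in B) in F}


def f(x):
    """A sample map X -> K (parity), purely illustrative."""
    return "even" if x % 2 == 0 else "odd"


X = {0, 1, 2, 3}
K = {"even", "odd"}

U = principal_ultrafilter(X, 3)
fU = pushforward(f, X, K, U)
```

Here `fU` should coincide with `principal_ultrafilter(K, f(3))`. On infinite sets non-principal ultrafilters exist, and these cannot be enumerated in this way.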
§B.11. Choquet’s Theorem

Our proof of Choquet’s Theorem was adapted from Simon (2011) and Ebbesen
(2012). For an extensive treatment of the surrounding Choquet Theory see e.g.
Alfsen (1970), Bratteli & Robinson (1987), or Phelps (2001). For the Schläfli
classification see Coxeter (1948).

§B.12. A precis of infinite-dimensional Hilbert space 

To prove separability of $H = L^2(\mathbb{R}^d)$, note that a dense subset is given by the set
of all functions of the form $1_{B_n^d}\, p$, where $n \in \mathbb{N}$, $B_r^d = \{x \in \mathbb{R}^d \mid \|x\| < r\}$ is the $d$-ball
of radius $r$, and $p$ is some polynomial on $\mathbb{R}^d$ with rational coefficients. Alternatively,
take the complex rational linear span of all functions of the form $1_A$, where $A \subset \mathbb{R}^d$
is a rectangle with rational coefficients (proving density in either case requires some
measure theory). The latter construction has the advantage over the former that it
can be generalized to Hilbert spaces $H = L^2(X)$ for which the underlying measure
space $(X,\Sigma,\mu)$ satisfies the condition that the space of sets $A \in \Sigma$ with $\mu(A) < \infty$
is separable in the metric $d(F,G) = \mu(F \Delta G)$, where $F \Delta G = (F \cap G^c) \cup (F^c \cap G)$ is
the symmetric difference. Indeed, $L^2(X)$ is separable iff this condition is satisfied.

This class includes the important case where the underlying topological space
$X$ is Polish (i.e., homeomorphic to a complete separable metric space), $\Sigma$ consists
of the associated Borel sets, and $\mu$ is a $\sigma$-finite regular measure. If, furthermore, $\mu$
is finite, then Lemma B.121 (in its original form for Polish spaces) applies. As in
the proof of Theorem B.118, this induces Hilbert space isomorphisms like (in the
second case) $L^2(X) \cong L^2(0,1)$, which do not require a choice of basis. See Royden
(1988), Thm. 15.5.16 and Prop. 15.5.12, and Halmos (1974), p. 177.

§B.14. Basic spectral theory 

Our terminology “continuous spectrum” $\sigma_c(a)$ for the complement of the point
spectrum $\sigma_p(a)$ is not standard; many authors reserve the former term for the
complement of $\sigma_p(a)$ as well as the so-called residual spectrum $\sigma_r(a)$, which is defined
as the set of those $\lambda \in \sigma(a)$ for which $\lambda \notin \sigma_p(a)$ and $\mathrm{ran}(a - \lambda)^- \neq H$. However,
for self-adjoint operators $a$ (which is all we need in this book, and in quantum
mechanics), it follows from e.g. Theorem B.93 that $\sigma_r(a) = \emptyset$, so that at least for $a^* = a$
“our” continuous spectrum $\sigma_c(a)$ matches with the usual terminology.

The proof of (B.258) in any Banach algebra $A$ with unit $1_A$ is as follows. We first
show that the sum is a Cauchy sequence. Indeed, for $n > m$ one has

$$\Big\| \sum_{k=0}^{n} a^k - \sum_{k=0}^{m} a^k \Big\| = \Big\| \sum_{k=m+1}^{n} a^k \Big\| \le \sum_{k=m+1}^{n} \|a^k\| \le \sum_{k=m+1}^{n} \|a\|^k. \tag{B.613}$$

For $n, m \to \infty$ this converges to 0 by the theory of the geometric series. Since $A$ is
complete, the Cauchy sequence $\sum_{k=0}^{n} a^k$ converges for $n \to \infty$. Now compute

$$\sum_{k=0}^{n} a^k (1_A - a) = \sum_{k=0}^{n} \big(a^k - a^{k+1}\big) = 1_A - a^{n+1}. \tag{B.614}$$

Hence

$$\Big\| 1_A - \sum_{k=0}^{n} a^k (1_A - a) \Big\| = \|a^{n+1}\| \le \|a\|^{n+1}, \tag{B.615}$$

which converges to zero when $n \to \infty$, as $\|a\| < 1$ by assumption. Thus

$$\lim_{n\to\infty} \sum_{k=0}^{n} a^k (1_A - a) = 1_A. \tag{B.616}$$

By a similar argument,

$$\lim_{n\to\infty} (1_A - a) \sum_{k=0}^{n} a^k = 1_A, \tag{B.617}$$

so that, by continuity of multiplication in a Banach algebra, one finally has

$$\lim_{n\to\infty} \sum_{k=0}^{n} a^k = (1_A - a)^{-1}. \tag{B.618}$$
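
As a numerical illustration of this geometric (Neumann) series argument, one may take the Banach algebra of real 2x2 matrices and check that the partial sums of $\sum_k a^k$ act as a two-sided inverse of $1_A - a$. This is only a sketch: the matrix `a`, the cutoff `60`, and all helper names are arbitrary choices made here, not taken from the text.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(p, q):
    """Entrywise sum of two 2x2 matrices."""
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]


def matsub(p, q):
    """Entrywise difference of two 2x2 matrices."""
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def neumann_partial_sum(a, n):
    """Return the partial sum a^0 + a^1 + ... + a^n."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY  # a^0
    for _ in range(n + 1):
        total = matadd(total, power)
        power = matmul(power, a)
    return total


# Sample element with operator norm below 1 (its entries are small).
a = [[0.2, 0.1], [0.0, 0.3]]
s = neumann_partial_sum(a, 60)

left = matmul(s, matsub(IDENTITY, a))    # partial sum times (1 - a)
right = matmul(matsub(IDENTITY, a), s)   # (1 - a) times partial sum
max_error = max(abs(m[i][j] - IDENTITY[i][j])
                for m in (left, right) for i in range(2) for j in range(2))
```

The quantity `max_error` measures how far both products are from the identity. By the telescoping identity it equals the size of $a^{n+1}$ up to rounding, which is negligible for this choice of `a`.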


To see that the closure $a^-$ of a closable operator $a$ is indeed closed (!), suppose
$f_n \to f$ and $a f_n \to g$, with $(f_n)$ in $D(a^-)$. Since $f_n \in D(a^-)$ for fixed $n$, there exists
$(f_{m,n})$ in $D(a)$ such that $\lim_m f_{m,n} = f_n$ and $\lim_m a f_{m,n} = g_n$ exists. Then clearly

$$\lim_{m,n} f_{m,n} = f, \tag{B.619}$$

and we claim that

$$\lim_{m,n} a f_{m,n} = g. \tag{B.620}$$

Namely, $\|a f_{m,n} - g\| \le \|a f_{m,n} - a f_n\| + \|a f_n - g\|$. For $\varepsilon > 0$, take $n$ so that the
second term is $< \varepsilon/2$. For that $n$, the vectors $a(f_{m,n} - f_n)$ converge, as $m \to \infty$, since
$a f_{m,n} \to g_n$ and $a f_n$ is independent of $m$. Also, recall that $f_{m,n} - f_n \to 0$ as $m \to \infty$.
By assumption, $a$ is closable, hence by definition one must have $a(f_{m,n} - f_n) \to 0$
in $m$. Hence we may find $m$ so that $\|a f_{m,n} - a f_n\| < \varepsilon/2$, so that $\|a f_{m,n} - g\| < \varepsilon$,
and (B.620) follows. Hence $f \in D(a^-)$. Finally, since $a^- f = \lim_{m,n} a f_{m,n}$ one has
$a^- f = g$ by (B.620), or $a^- f = \lim_n a f_n$ by definition of $g$. Thus $a^-$ is closed.

§B.15. The spectral theorem 

By (B.319), von Neumann algebras like $W^*(a)$ are complete under strong
convergence of nets (rather than merely sequences), and if some net is monotone increasing
(or decreasing) and bounded, the strong limit equals the supremum (or infimum), as
in Proposition B.98. This yields operatorial versions of (B.40) - (B.44):

$$e_U = \sup\{f(a) \mid f \in C_c(U),\ 0 \le f \le 1_{\sigma(a)}\}; \tag{B.621}$$

$$e_K = \inf\{f(a) \mid f \in C(\sigma(a)),\ 0 \le f \le 1_{\sigma(a)},\ f|_K = 1_K\}; \tag{B.622}$$

$$e_A = \inf\{e_U \mid U \supseteq A,\ U \in \mathcal{O}(\sigma(a))\} \tag{B.623}$$

$$\phantom{e_A} = \sup\{e_K \mid K \subseteq A,\ K \in \mathcal{K}(\sigma(a))\}, \tag{B.624}$$

where $U \in \mathcal{O}(\sigma(a))$ is open, $K \in \mathcal{K}(\sigma(a))$ is compact, and $A \subset \sigma(a)$ is Borel.
§B.16. Abelian *-algebras in $B(H)$

For an alternative proof of Proposition B.106, one observes that 


W 


I fW= f bW={V\¥\,bW/V\w\) 

Jo Jo 


(B.625) 


defines a bounded functional on $L^2(0,1)$ seen as a dense subspace of $L^1(0,1)$, and
uses the duality $L^1(0,1)^* = L^\infty(0,1)$. Indeed, using Cauchy-Schwarz, one has


/' 


fw 


= |(v1^,itv//N^)l<ll^llll^ll2llWv1^ll2 = 


(B.626) 


§B.17. Classification of maximal abelian *-algebras in $B(H)$

Theorem B.118 goes back to von Neumann (1931); for the details of the second
proof see Kadison & Ringrose (1986), §9.4, or, very lucidly, Stevens (2016).

§B.20. The trace 

The trace is often neglected in functional analysis books, except when these tend 
to quantum mechanics (Reed & Simon, 1972) or to operator algebras (Pedersen, 
1989). Eqs. (B.476) - (B.477) and (B.496) reflect the function space dualities 

$$c_0(\mathbb{N})^* \cong \ell^1(\mathbb{N}); \tag{B.627}$$

$$\ell^1(\mathbb{N})^* \cong \ell^\infty(\mathbb{N}); \tag{B.628}$$

(B.629) 


Similar to the $\ell^p$-spaces, one has Banach spaces $B_p(H)$ residing in $B_0(H)$ for each
$1 \le p < \infty$, called Schatten-von Neumann ideals, see e.g. Simon (2005).

§B.21. Spectral theory for unbounded self-adjoint operators 

Our approach to unbounded operators via the bounded transform combines ideas 
from Kaufman (1978), Woronowicz (1991), Woronowicz & Napiorkowski (1992), 
Schmüdgen (2012), and Koliha (2014). The proof of Theorem B.159 via Lemma
B.160 (due to Nelson, 1959), was suggested to the author by Nigel Higson. The last 
part of §B.21 was inspired by Lemma 5.2.8 in Pedersen (1989), in which we have 
simply replaced the Cayley transform by the bounded transform. 

The idea of affiliating closed operators to von Neumann algebras goes back to von
Neumann; our brief treatment is hopefully more appealing than the elaborate
constructions in Kadison & Ringrose (1983), §5.6. A number of details were supplied
in the M.Sc. thesis of Christian Budde (2015); see also Budde & Landsman (2016).

For general C*-algebras $A$, the multiplier algebra $M(A)$ consists of all maps $m : A \to A$
for which there exists an adjoint $n = m^* : A \to A$ such that $b^* m(a) = n(b)^* a$. Such
maps are automatically linear and bounded, and $M(A)$ is a C*-algebra itself as a
subalgebra of the Banach space $B(A)$ of all bounded linear maps on $A$, enriched
with the adjoint $m^* = n$. See, e.g., Lance (1995), or §C.10 below. For commutative
C*-algebras this reduces to the definition in the main text, which dates from
Wang (1961). For unbounded multipliers see Woronowicz (1991) and Lance (1995);
Woods (1979) treats the bounded case.
Appendix C

Operator algebras 


This appendix provides a short course in operator algebras, building on the previous 
appendix. Indeed, there is surprisingly little algebra in the subject (so that there are 
hardly any prerequisites in that direction), and quite a lot of functional analysis, 
involving both operators on Hilbert space and more general Banach space theory. 

Traditionally, the field of operator algebras has had two branches: C*-algebras 
and von Neumann algebras. Although historically speaking the latter (invented by 
von Neumann in 1930) preceded the former (introduced by Gelfand and Naimark 
in 1943), the logical order of presentation is the opposite, since von Neumann
algebras turned out to be special cases of C*-algebras (with additional structure).
Furthermore, for reasons in the foundations of quantum mechanics (as explained in 
the main text), beside von Neumann algebras we will discuss a few lesser known 
special cases of C*-algebras, such as scattered C*-algebras and AW*-algebras. 


C.1 Basic definitions and examples

A C*-algebra is both an associative algebra and a Banach space, as follows:

Definition C.1. 1. A Banach algebra is a Banach space $A$ that is simultaneously
an algebra in which

$$\|ab\| \le \|a\|\,\|b\| \quad (a, b \in A). \tag{C.1}$$

2. An involution on an algebra $A$ is a real-linear map $* : A \to A$, written $a \mapsto a^*$,
such that $a^{**} = a$, $(ab)^* = b^* a^*$, and $(\lambda a)^* = \overline{\lambda} a^*$ for all $a, b \in A$ and $\lambda \in \mathbb{C}$. An
algebra with involution is also called a *-algebra.

3. A C*-algebra is a Banach algebra $A$ with involution in which

$$\|a^* a\| = \|a\|^2 \quad (a \in A). \tag{C.2}$$

With the same proof as (A.22), these axioms imply

$$\|a^*\| = \|a\|. \tag{C.3}$$

© The Author(s) 2017 645 
The three main examples (at least for a first orientation) are:

• The space $C_0(X)$ of all continuous functions $f : X \to \mathbb{C}$ that vanish at infinity,
where $X$ is some locally compact Hausdorff space (see §B.3). This is an algebra
under pointwise operations: addition is given by $(\lambda \cdot f + g)(x) = \lambda f(x) + g(x)$,
and multiplication is $(fg)(x) = f(x) g(x)$. Furthermore, it has a natural involution
$f^*(x) = \overline{f(x)}$, and a natural norm $\|f\|_\infty = \sup_{x \in X}\{|f(x)|\}$, cf. (B.23). The above
axioms of a C*-algebra are easily verified. Note that $C_0(X)$ has a unit (namely the
function $1_X$ equal to 1 for any $x$) iff $X$ is compact. It is of fundamental importance
for physics and mathematics that $C_0(X)$ is a commutative C*-algebra.

• The space $B(H)$ of all bounded operators on some Hilbert space $H$, with obvious
algebraic operations, involution given by the adjoint (see (A.15)), and the standard
operator norm $\|a\| = \sup\{\|a\psi\|,\ \psi \in H,\ \|\psi\| = 1\}$. See Proposition A.7 and
Theorem B.33 for the proof that $B(H)$ is a C*-algebra; it has a unit, given by the
identity $1_H$. If $\dim(H) > 1$, this is a highly non-commutative C*-algebra.

• The space $B_0(H)$ of all compact operators on some Hilbert space $H$, with
operations inherited from $B(H)$; see Theorem B.130, which not merely shows that
$B_0(H)$ is a C*-algebra, but also that it is a (closed) two-sided ideal in $B(H)$. It
fails to have a unit whenever $H$ is infinite-dimensional (this follows from almost
any result in §B.19, such as Theorem B.135).
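
For the first example, the axioms (C.1) and (C.2) can be checked by hand when $X$ is a finite set, so that a function is just a tuple of complex values. The sketch below assumes a three-point space and two arbitrarily chosen sample functions `f` and `g`; all names are illustrative only.

```python
def sup_norm(f):
    """Supremum norm of a function on a finite set, stored as a list of values."""
    return max(abs(v) for v in f)


def star(f):
    """Involution: pointwise complex conjugation."""
    return [complex(v).conjugate() for v in f]


def mul(f, g):
    """Pointwise product."""
    return [x * y for x, y in zip(f, g)]


# Two sample functions on a hypothetical three-point space X = {0, 1, 2}.
f = [1 + 2j, -3j, 0.5]
g = [2, 1 - 1j, -4]

c_star_lhs = sup_norm(mul(star(f), f))   # ||f* f||
c_star_rhs = sup_norm(f) ** 2            # ||f||^2
product_norm = sup_norm(mul(f, g))       # ||f g||
norm_bound = sup_norm(f) * sup_norm(g)   # ||f|| ||g||
```

For these values `c_star_lhs` and `c_star_rhs` agree, while `product_norm` stays below `norm_bound`, in line with (C.2) and (C.1) respectively.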

Definition C.2. 1. A homomorphism between C*-algebras $A$ and $B$ is a linear map
$\varphi : A \to B$ that for all $a, b \in A$ satisfies

$$\varphi(ab) = \varphi(a)\varphi(b); \tag{C.4}$$

$$\varphi(a^*) = \varphi(a)^*. \tag{C.5}$$

2. An isomorphism between two C*-algebras is an invertible homomorphism. If $A$
and $B$ are isomorphic as C*-algebras in this sense, we write $A \cong B$.

It follows from linear algebra that the set-theoretic inverse of an invertible linear
map $\varphi : A \to B$ is automatically linear. It is similarly easy to show that the inverse
of an invertible homomorphism is itself a homomorphism, but it is a deeper fact
about C*-algebras that an isomorphism is automatically isometric (and hence has
an isometric inverse); see Theorem C.62. Furthermore, if $B = \mathbb{C}$, then the property
$\varphi(a^*) = \varphi(a)^*$ follows from the other conditions on a homomorphism.

The following notion, originally inspired by quantum mechanics (and turned into 
mathematics by von Neumann), gives a geometric flavor to operator algebras. 

Definition C.3. A state on a C*-algebra $A$ is a bounded linear map $\omega : A \to \mathbb{C}$ that
satisfies:

1. $\omega(a^* a) \ge 0$, $a \in A$ (positivity);

2. $\|\omega\| = 1$ (normalization).

If $A$ has a unit, the definition of a state considerably simplifies.

Lemma C.4. Let $A$ be a C*-algebra with unit and let $\omega : A \to \mathbb{C}$ be a linear map.
Then $\omega$ is positive iff it is bounded and satisfies $\|\omega\| = \omega(1_A)$.
The proof requires some positivity theory in C*-algebras, so we postpone it to §C.7,
but as of now, we immediately infer that in the unital case we have:

Proposition C.5. A linear map $\omega : A \to \mathbb{C}$ on a unital C*-algebra is a state iff $\omega$ is
positive and satisfies $\omega(1_A) = 1$, and hence iff $\omega$ is bounded with $\|\omega\| = \omega(1_A) = 1$.

Using the Banach-Alaoglu Theorem B.48, this implies that the state space $S(A)$ of
a unital C*-algebra $A$, i.e., the set of all states on $A$, is a compact convex subset of $A^*$
in its $w^*$-topology. Defining the pure state space $P(A)$ of $A$ as the extreme boundary
$\partial_e S(A)$, the Krein-Milman Theorem B.50 almost immediately implies:

Theorem C.6. Let $A$ be a C*-algebra with unit, having state space $S(A)$ and pure
state space $P(A) = \partial_e S(A)$. Then $P(A) \neq \emptyset$ and $S(A) = \mathrm{co}(P(A))^-$.

In words, C*-algebras have sufficiently many pure states to approximate general 
states arbitrarily well, at least in the w*-topology (of “expectation values”). 

The only complication in applying Theorem B.50 to $K = S(A) \subset A^*$ is that $A$ is
a complex Banach space, but the situation may be reduced to the real Banach space

$$A_{sa} = \{a \in A \mid a^* = a\}. \tag{C.6}$$

Lemma C.7. Let $A$ be a C*-algebra with unit. If $\omega \in S(A)$, then $\omega(a^*) = \overline{\omega(a)}$.

Proof. Using Definition C.3.2 and eq. (C.2), for any $a^* = a$ and $t \in \mathbb{R}$ we have

$$|\omega(a + it)|^2 \le \|a + it\|^2 = \|(a - it)(a + it)\| = \|a^2 + t^2\| \le \|a\|^2 + t^2. \tag{C.7}$$

Writing $\omega(a) = \alpha + i\beta$, where $\alpha, \beta \in \mathbb{R}$, this gives $\alpha^2 + \beta^2 + 2\beta t \le \|a\|^2$ for all $t \in \mathbb{R}$,
which forces $\beta = 0$. This proves the claim for self-adjoint $a$. For the general case,
one uses the following decomposition of $a$ as a sum of two self-adjoint operators:

$$a = b + ic \quad (b^* = b,\ c^* = c); \tag{C.8}$$

$$b = \tfrac{1}{2}(a + a^*), \quad c = -\tfrac{1}{2} i (a - a^*). \tag{C.9}$$
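
The decomposition (C.8) - (C.9) can be verified concretely in the C*-algebra of complex 2x2 matrices, where the involution is the conjugate transpose. This is a minimal sketch; the matrix `a` is an arbitrary sample and the helpers `adjoint`, `combine` and `close` are names introduced here for illustration.

```python
def adjoint(a):
    """Conjugate transpose of a 2x2 complex matrix (nested lists)."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def combine(alpha, p, beta, q):
    """The linear combination alpha*p + beta*q of two 2x2 matrices."""
    return [[alpha * p[i][j] + beta * q[i][j] for j in range(2)]
            for i in range(2)]


def close(p, q, tol=1e-12):
    """True if two 2x2 matrices agree entrywise up to tol."""
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(2) for j in range(2))


a = [[1 + 2j, 3 - 1j], [4j, -2 + 0.5j]]
a_star = adjoint(a)

b = combine(0.5, a, 0.5, a_star)       # b = (a + a*)/2
c = combine(-0.5j, a, 0.5j, a_star)    # c = -i (a - a*)/2
recombined = combine(1, b, 1j, c)      # b + i c
```

One expects `b` and `c` to equal their own adjoints and `recombined` to reproduce `a`, which is what (C.8) asserts.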

Consequently, we may restrict a state $\omega \in S(A)$ to a real-linear functional

$$\omega_{\mathbb{R}} = \omega|_{A_{sa}} : A_{sa} \to \mathbb{R} \tag{C.10}$$

that satisfies $\omega(1_A) = 1$ and $\omega(a^2) \ge 0$ for any $a \in A_{sa}$, where we used Theorem
C.52 below to reformulate the positivity condition on states in terms of self-adjoint
operators alone. Conversely, we may extend a state $\omega_{\mathbb{R}}$ on $A_{sa}$ to a state $\omega$ on $A$ by

$$\omega(a) = \omega_{\mathbb{R}}(b) + i\,\omega_{\mathbb{R}}(c), \tag{C.11}$$

assuming (C.8) - (C.9). We then have $\|\omega\| = \|\omega_{\mathbb{R}}\| = 1$, since obviously $\|\omega_{\mathbb{R}}\| \le
\|\omega\| = 1$ (since its sup-norm is computed on fewer operators), but also $\omega(1_A) = 1$.
Thus we may regard $S(A)$ as a compact convex set in the real Banach space $A_{sa}^*$ rather
than in the complex Banach space $A^*$, and Theorem B.50 applies. Alternatively, one
could have extended the latter to the complex case, which is possible with a similar
(lack of) effort as in the procedure above.
C.2 Gelfand isomorphism

The example $A = C_0(X)$ of a commutative C*-algebra given in the previous section
is more than that; as proved in the very first (1943) paper on C*-algebras by Gelfand
and Naimark (despite whom one often speaks of the Gelfand isomorphism), it is generic.

Theorem C.8. Every commutative C*-algebra $A$ is isomorphic to $C_0(X)$ for some
locally compact Hausdorff space $X$, which is unique up to homeomorphism.

The proof is technically intricate at points, but the main idea is quite simple:

1. The space $X$ may be taken to be the Gelfand spectrum $\Sigma(A)$ of $A$, i.e., the set of
all nonzero linear maps $\omega : A \to \mathbb{C}$ that satisfy $\omega(ab) = \omega(a)\omega(b)$ (and hence
are homomorphisms $A \to \mathbb{C}$ as algebras). For example, if $A$ is already given as
$C_0(X)$, then each $x \in X$ defines $\omega_x \in \Sigma(A)$ by $\omega_x(f) = f(x)$, which is linear and
multiplicative (by the pointwise definition of addition and multiplication in $A$).

2. The Gelfand transform maps each $a \in A$ to a function $\hat{a} : \Sigma(A) \to \mathbb{C}$ by

$$\hat{a}(\omega) = \omega(a) \quad (a \in A,\ \omega \in \Sigma(A)). \tag{C.12}$$

3. The Gelfand topology is the weakest topology on $\Sigma(A)$ making all functions $\hat{a}$
continuous (i.e., the topology generated by the sets $\hat{a}^{-1}(U)$, $U \subset \mathbb{C}$ open, $a \in A$).
In this topology, $\Sigma(A)$ is compact iff $A$ has a unit, and locally compact otherwise.

4. The isomorphism $A \to C_0(\Sigma(A))$, then, is just given by the Gelfand transform.
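
Steps 1 and 2 can be made explicit in the simplest case $A = C(X)$ with $X$ a finite set of $n$ points, so that $A = \mathbb{C}^n$ with pointwise operations. A linear functional is then fixed by its values $w_i$ on the indicator functions $e_i$, and multiplicativity on this basis forces $w_i w_j = \delta_{ij} w_i$, so each $w_i$ is 0 or 1. The sketch below, with hypothetical names and $n = 3$, enumerates these candidates and keeps the nonzero multiplicative ones.

```python
from itertools import product

n = 3  # number of points of the hypothetical finite space X = {0, 1, 2}


def is_character(w):
    """Check nonzero-ness and multiplicativity on the basis e_0, ..., e_{n-1}.

    Since e_i e_j = e_i if i == j and 0 otherwise, multiplicativity
    on the basis reads w[i] * w[j] == (w[i] if i == j else 0).
    """
    if not any(w):
        return False
    return all(w[i] * w[j] == (w[i] if i == j else 0)
               for i in range(n) for j in range(n))


# Candidate functionals, encoded by their values on the basis.
characters = [w for w in product((0, 1), repeat=n) if is_character(w)]


def apply(w, f):
    """Evaluate the functional encoded by w on the function f."""
    return sum(wi * fi for wi, fi in zip(w, f))


f = (2 + 1j, -1, 0.5j)
gelfand_transform = [apply(w, f) for w in characters]
```

In this finite setting the surviving functionals are the $n$ point evaluations, and the Gelfand transform of `f` evaluated on them returns the values of `f` itself, so that $\Sigma(C(X))$ may be identified with $X$.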

This picture becomes even more compelling from the following observation: 

Lemma C.9. For any (i.e. not necessarily commutative) C*-algebra $A$ we have
$\Sigma(A) \subset A^*$. Furthermore, for any $\omega \in \Sigma(A)$,

$$\|\omega\| = 1, \tag{C.13}$$

and if $A$ has a unit $1_A$, then also

$$\omega(1_A) = 1. \tag{C.14}$$

In other words, multiplicative linear functionals on $A$ are automatically continuous
(recall that $A^*$ is the Banach space of continuous linear maps from $A$ to $\mathbb{C}$, see §B.9).

Throughout the rest of this section we restrict all proofs to the unital case; the
general case may be handled by the technique of unitization to be discussed in §C.6.

Proof. Let $\omega \in \Sigma(A)$. By multiplicativity, $\ker(\omega)$ is a two-sided ideal in $A$. Trivially,
for any $a \in A$, we have $a - \omega(a) \cdot 1_A \in \ker(\omega)$. If this element were invertible, then
$\ker(\omega)$ would contain the unit $1_A$ and hence would coincide with $A$, contradicting
the definition of $\Sigma(A)$ (which requires $\omega$ to be nonzero). Hence $\omega(a) \in \sigma(a)$. By
the spectral radius formula (B.255) we have $|\omega(a)| \le \|a\|$, whence $\omega \in A^*$.

Furthermore, $\omega(1_A^2) = \omega(1_A)^2$, whence $\omega(1_A) = 1$ or $0$, the latter being excluded
since it would imply that $\omega(a) = 0$ for all $a \in A$. This gives (C.14) (which also
follows from Lemma C.4, given Lemma C.11 below), which in turn gives (C.13). □
The Gelfand topology on $\Sigma(A)$ coincides with the weak* topology inherited
from $A^*$, which is simply the topology of pointwise convergence (i.e. $\omega_\lambda \to \omega$ iff
$\omega_\lambda(a) \to \omega(a)$ for each $a \in A$), and the Gelfand transform $a \mapsto \hat{a}$ is (by abuse of
notation) the image of $a$ in $A^{**}$ under the canonical injection $A \hookrightarrow A^{**}$ appearing in
Proposition B.44, restricted (as a function on $A^*$) to the subset $\Sigma(A) \subset A^*$. From this
perspective, continuity of $\hat{a}$ immediately follows from Proposition B.46.

This picture of the Gelfand topology also has a technical advantage, for we infer: 

Lemma C.10. If $A$ is unital, then its Gelfand spectrum $\Sigma(A)$ is compact Hausdorff.

Proof. By Lemma C.9, $\Sigma(A)$ lies in the unit ball of $A^*$, which by the Banach-Alaoglu
Theorem is compact in its weak* topology. So we are ready if we show
that $\Sigma(A)$ is a weak*-closed subset of $A^*$, which is obvious from its definition: if
$\omega_\lambda \to \omega$, then for any $a, b \in A$ we obviously have

$$\omega(ab) = \lim_\lambda \omega_\lambda(ab) = \lim_\lambda \omega_\lambda(a)\,\omega_\lambda(b) = \omega(a)\,\omega(b). \tag{C.15}$$

We now show that the Hausdorff property of $\Sigma(A)$ is inherited from $A^*$. A subbasis
of its weak* topology is given by sets of the form

$$U^a_\varepsilon(\varphi) = \{\rho \in A^* \mid |\varphi(a) - \rho(a)| < \varepsilon\}, \tag{C.16}$$

where $a \in A$, $\varphi \in A^*$, and $\varepsilon > 0$. Replacing $\rho \in A^*$ by $\rho \in \Sigma(A)$ we thus obtain
a subbasis of the Gelfand topology. If $\omega$ and $\omega'$ are distinct points in $\Sigma(A)$, there
exists $a \in A$ such that $\omega(a) \neq \omega'(a)$. Taking some $0 < \varepsilon < |\omega(a) - \omega'(a)|/2$, the
two points in question are separated by the opens $U^a_\varepsilon(\omega)$ and $U^a_\varepsilon(\omega')$. □

It is immediate from the definition of $\Sigma(A)$ that $a \mapsto \hat{a}$ is an algebra homomorphism,
since we have

$$\widehat{ab}(\omega) = \omega(ab) = \omega(a)\,\omega(b) = \hat{a}(\omega)\,\hat{b}(\omega) = (\hat{a} \cdot \hat{b})(\omega). \tag{C.17}$$

The fact that the Gelfand transform preserves the involution follows from:

Lemma C.11. If $\omega \in \Sigma(A)$, then $\omega(a^*) = \overline{\omega(a)}$, and hence $\widehat{a^*} = (\hat{a})^*$.

Proof. Using (C.14) and (C.2), the proof is the same as for Lemma C.7. □

The hard part of the proof of Theorem C.8 is isometricity of the Gelfand transform:

$$\|\hat{a}\|_\infty = \|a\|. \tag{C.18}$$

As always, isometricity obviously implies injectivity. Surprisingly, using the
Stone-Weierstrass Theorem B.51, in this case isometricity also yields surjectivity of the
map $a \mapsto \hat{a}$. Namely, if we take $X = \Sigma(A)$, and $B$ to be the image $\hat{A}$ of $A$ under the
Gelfand transform, then the conditions on $B$ in Theorem B.51 are easily verified.
Assuming (C.18), this image is obviously closed, so that $\hat{A} = C(\Sigma(A))$. With injectivity
also implied by (C.18), it follows that the Gelfand transform is an isomorphism.
It remains to prove (C.18), which conceptually is a conjunction of two equalities:

$$\|\hat{a}\|_\infty = r(a); \tag{C.19}$$

$$\|a\| = r(a) \quad (a^* = a), \tag{C.20}$$

where $r(a) = \sup\{|\lambda|,\ \lambda \in \sigma(a)\}$ is the spectral radius of $a$, see Theorem B.84.
These immediately yield (C.18) for self-adjoint $a$, from which the general case
follows from (C.2), noting that $a^* a$ is self-adjoint for any $a$: assuming (C.19) - (C.20)
as well as the homomorphism property of the Gelfand transform, we compute

$$\|\hat{a}\|_\infty^2 = \|\hat{a}^* \hat{a}\|_\infty = \|\widehat{a^* a}\|_\infty = \|a^* a\| = \|a\|^2. \tag{C.21}$$

Since (C.20) just repeats (B.257), we already know it is true for general C*-algebras
(so far, with unit). As we shall now show, (C.19) holds in any commutative Banach
algebra with unit. The key is the following lemma.

Lemma C.12. Let $A$ be a commutative Banach algebra with unit and let $a \in A$. For
any $\lambda \in \sigma(a)$ there is an element $\omega \in \Sigma(A)$ such that $\lambda = \omega(a)$.

Granted this, and using the proof of Lemma C.9 as well as (B.253), we obtain

$$\sigma(a) = \sigma(\hat{a}), \tag{C.22}$$

for any $a \in A$. Given (B.254), this yields (C.19) and hence the Gelfand isomorphism.

There are two approaches to our crucial Lemma C.12, each having its own merits.
The first and best known proof, going back to Gelfand himself, relies on the theory
of (maximal) ideals in Banach algebras. It is based on the following identification:

Proposition C.13. Let $A$ be a commutative Banach algebra with unit. There is a
bijective correspondence between $\Sigma(A)$ and the set $\mathcal{M}(A)$ of maximal ideals in $A$,

$$\omega \leftrightarrow \ker(\omega). \tag{C.23}$$

This will be proved in §C.8 below, which also contains the relevant background.

It implies Lemma C.12, as follows: if $\lambda \in \sigma(a)$, then by definition $a - \lambda$ is not
invertible in $A$, so that $J = \{(a - \lambda)b \mid b \in A\}$ is an ideal in $A$. By Zorn’s Lemma (or
Hausdorff’s Maximality Theorem), applied to the partially ordered set of all proper
ideals in $A$ that contain $J$ (ordered by inclusion), $J$ is contained in some maximal
ideal, so that $J \subset \ker(\omega)$ for some $\omega \in \Sigma(A)$. Since $a - \lambda \in J$ (take $b = 1_A$), from
(C.14) we obtain $\omega(a) = \lambda$. Note the non-constructive nature of this argument!

The other line of proof, due to Kadison, uses a different characterization of $\Sigma(A)$:

Proposition C.14. Let $A$ be a commutative C*-algebra with unit. Then the Gelfand
spectrum $\Sigma(A)$ coincides with the pure state space $P(A)$.

Recall Definition 1.10 and Theorem C.6; the pure state space $P(A) = \partial_e S(A)$ of a
C*-algebra $A$ is defined as the extreme boundary of the state space of $A$. The argument that
instantly delivers Lemma C.12 from Proposition C.14, then, is as follows:
Proposition C.15. Let $A$ be a C*-algebra with unit. For any normal element $a \in A$
(i.e., $aa^* = a^* a$) and $\lambda \in \sigma(a)$, there is a pure state $\omega \in P(A)$ such that $\omega(a) = \lambda$.

The proof of both results uses some positivity theory for C*-algebras, which is
systematically developed in §C.7 below. Here, we just need that $a \in A$ is positive,
written $a \ge 0$, iff $a = b^* b$ for some $b \in A$, iff $a$ is self-adjoint with $\sigma(a) \subset [0, \infty)$.

We write $a \ge b$ or $b \le a$ if $a - b$ is positive. Also, a linear functional $\omega : A \to \mathbb{C}$ is
called positive iff $\omega(a) \ge 0$ for all $a \ge 0$, and we write $\omega \ge \varphi$ or $\varphi \le \omega$ if $\omega - \varphi \ge 0$.

Let us note that the proofs of these results in §C.7 use some Gelfand theory, but
this use is limited to Theorem C.25, which could have been proved à la Theorem
C.24, whose proof derives the Gelfand isomorphism in the special case at hand.
Therefore, the use of Propositions C.14 and C.15 in the proof of (C.18) and hence
of Theorem C.8 does not render this line of proof of the latter circular.

In particular, the proof of Proposition C.14 relies on:

Lemma C.16. If $a^* = a \in A$ there is a number $t \ge 0$ such that $t \pm a \ge 0$.

Proof. Since $\sigma(a) \subset \mathbb{R}$ is compact (see Corollary C.27 and Theorem B.84), we have
$\sigma(a) \subset [-t, t]$ for some $t \ge 0$. It is clear from the definition of $\sigma(a)$ that $\sigma(t \pm a) =
t \pm \sigma(a)$, which yields the lemma by the criterion for positivity just stated. □

We now prove Proposition C.14. 

Proof. It is clear from Lemma C.11 and eq. (C.14) that $\omega \in \Sigma(A)$ is a state. To show
that $\omega$ is pure, we use the fact that for any state $\omega \in S(A)$, the expression

$$(b, a) = \omega(b^* a) \tag{C.24}$$

defines a hermitian form on $A$; the easy proof again uses Lemma C.11. Applying
Cauchy-Schwarz with $1_A$ and using $1_A^* = 1_A = 1_A^2$ gives

$$|\omega(a)|^2 \le \omega(a^* a). \tag{C.25}$$

Now suppose that $\omega = \lambda\omega_1 + (1 - \lambda)\omega_2$ with $\omega_i \in S(A)$ and $\lambda \in (0,1)$. Applying
(C.25) (in the opposite direction) to $\omega_1$ and $\omega_2$ gives

$$\omega(a^* a) \ge \lambda|\omega_1(a)|^2 + (1 - \lambda)|\omega_2(a)|^2. \tag{C.26}$$

On the other hand, multiplicativity of $\omega$ gives

$$\omega(a^* a) = \lambda^2|\omega_1(a)|^2 + \lambda(1 - \lambda)\big(\omega_1(a)\overline{\omega_2(a)} + \omega_2(a)\overline{\omega_1(a)}\big) + (1 - \lambda)^2|\omega_2(a)|^2.$$

Subtracting this from (C.26) gives the inequality $0 \ge \lambda(1 - \lambda)|\omega_1(a) - \omega_2(a)|^2$, so
that $\omega_1 = \omega_2$, and hence $\omega$ is pure by definition. This shows that $\Sigma(A) \subset P(A)$.
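
The computation behind this step can be illustrated on $A = C(X)$ for a two-point space $X$, where the point evaluations are multiplicative. For a proper mixture $\omega = \lambda\,\mathrm{ev}_0 + (1 - \lambda)\,\mathrm{ev}_1$ one finds $\omega(f^* f) - |\omega(f)|^2 = \lambda(1 - \lambda)|f(0) - f(1)|^2$, which is strictly positive whenever $f(0) \neq f(1)$, so that such a mixture cannot be multiplicative. The sketch below uses arbitrary sample values for `lam` and `f`, chosen here for illustration.

```python
lam = 0.25                 # mixing weight in (0, 1), arbitrary sample value
f = (1 + 1j, -2 + 0j)      # a function on the two-point space X = {0, 1}


def mixture(g):
    """The state lam * ev_0 + (1 - lam) * ev_1 applied to a function g."""
    return lam * g[0] + (1 - lam) * g[1]


# f* f is the pointwise squared modulus of f.
f_star_f = tuple(abs(v) ** 2 for v in f)

lhs = mixture(f_star_f)                         # omega(f* f)
rhs = abs(mixture(f)) ** 2                      # |omega(f)|^2
gap = lhs - rhs
predicted_gap = lam * (1 - lam) * abs(f[0] - f[1]) ** 2
```

With these values `gap` matches `predicted_gap` and is positive, so the Cauchy-Schwarz inequality (C.25) is strict for the mixture, whereas it is an equality for each point evaluation.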

To prove the converse inclusion, we need another lemma. 

Lemma C.17. Let $\omega \in P(A)$ be a pure state on $A$. If $\tau : A \to \mathbb{C}$ is a linear functional
such that $0 \le \tau \le \omega$, then we can find a scalar $s \in [0,1]$ such that $\tau = s\omega$.

Proof. We assume $\tau \neq 0$ and $\tau \neq \omega$ (otherwise the claim is trivially true). By
Lemma C.16, this implies $\tau(1_A) \neq 0$ and $\tau(1_A) \neq 1$. For if $\tau(1_A) = 0$, then for
$a^* = a$ we find $t$ as in Lemma C.16, so that $t \pm a \ge 0$ and hence $0 \le \tau(t \pm a) = \pm\tau(a)$.
Hence $\tau(a) = 0$ on each self-adjoint $a$, which forces $\tau = 0$ by the usual
decomposition (C.8). If $\tau(1_A) = 1$, we apply a similar argument to the positive functional
$\omega - \tau$. Therefore, $t = 1 - \tau(1_A)$ satisfies $t \in (0,1)$, and defining $\omega_1 = (\omega - \tau)/t$
and $\omega_2 = \tau/\tau(1_A)$ we obtain a decomposition $\omega = t\omega_1 + (1 - t)\omega_2$. Since $\omega$ is
pure, this gives $\omega_1 = \omega_2 = \omega$ and hence $\tau = \tau(1_A)\omega$. Clearly, $0 \le \tau \le \omega$ enforces
$0 \le \tau(1_A) \le 1$, so the claim follows with $s = \tau(1_A)$. □

We now prove that $\omega \in P(A)$ is multiplicative on arbitrary $a \in A$, and $b \in A$ such
that (for the moment) $0 \le b \le 1_A$. Define $\omega_b : A \to \mathbb{C}$ by $\omega_b(a) = \omega(ab)$. Then
$0 \le \omega_b \le \omega$: taking $b = c^* c$, the first inequality $0 \le \omega_b$ follows from

$$\omega_b(a^* a) = \omega(c^* c\, a^* a) = \omega((ac)^* ac) \ge 0, \tag{C.27}$$

since $A$ is abelian, and the second is analogous, using the fact that $0 \le b \le 1_A$ implies
$0 \le 1_A - b \le 1_A$. Therefore, Lemma C.17 gives $\omega_b = s\omega$ with $s = \omega_b(1_A) = \omega(b)$.

For general $0 \neq b \ge 0$, we rewrite $b$ as $b = \|b\| \cdot (b/\|b\|)$, and use linearity of $\omega$
and the previous result to obtain multiplicativity. For general self-adjoint $b$ we use
Lemma C.53, and finally we use (C.8). □

At last, we are now in a position to prove Proposition C.15, so let $a \in A$ be normal.

Proof. Let $C^*(a)$ be the commutative C*-algebra generated by $a$ (and hence $a^*$) and
$1_A$ within $A$; as in Theorem C.25 below, this is the norm-closure of all polynomials in
$a$ and $a^*$, and $C(\sigma(a)) \cong C^*(a)$ via the map $f(\lambda, \overline{\lambda}) \mapsto f(a, a^*)$. Using Proposition
C.14, define a pure state $\omega_\lambda$ on $C^*(a)$ by linear and multiplicative extension of
$\omega_\lambda(1_A) = 1$, $\omega_\lambda(a) = \lambda$, and $\omega_\lambda(a^*) = \overline{\lambda}$, i.e., $\omega_\lambda(f(a, a^*)) = f(\lambda, \overline{\lambda})$.

Since $\|\omega_\lambda\| = 1$, Hahn-Banach (Corollary B.41, with $V = A$ and $W = C^*(a)$)
yields a linear extension $\omega'_\lambda : A \to \mathbb{C}$ of $\omega_\lambda$, which is in fact a state by Lemma
C.4. To show that $\omega'_\lambda$ may be chosen to be pure also on $A$, let $S_\lambda(A) \subset S(A)$ be the
set of all states on $A$ that extend $\omega_\lambda$. This is a nonempty weak*-closed and hence
weak*-compact convex subset of $S(A)$, which by the Krein-Milman Theorem B.50
has nonempty boundary $\partial_e S_\lambda(A)$. It is easy to show that $\partial_e S_\lambda(A) \subset \partial_e S(A) = P(A)$:
for $\omega \in \partial_e S_\lambda(A)$, suppose $\omega = t\omega_1 + (1 - t)\omega_2$, with $t \in (0,1)$ and $\omega_i \in S(A)$. Since
$\omega|_{C^*(a)} = \omega_\lambda$ is pure, we have $\omega_1|_{C^*(a)} = \omega_2|_{C^*(a)} = \omega_\lambda$, or $\omega_i \in S_\lambda(A)$. But $\omega$ was
assumed pure in $S_\lambda(A)$, so that $\omega_1 = \omega_2 = \omega$, i.e., $\omega \in \partial_e S(A)$. Hence if we choose
$\omega'_\lambda \in \partial_e S_\lambda(A)$, then the extension $\omega'_\lambda$ of $\omega_\lambda$ is also pure on $A$. □

The following ingredients are still missing from the proof of Theorem C.8:

• The proof of the uniqueness of $X$ up to homeomorphism (see §C.3).

• The proof of Proposition C.13 (see §C.8). 

• The extension of the entire argument to the non-unital case (see §C.6). 

We start with the first issue, which we fill in more broadly than needed for the proof 
of Theorem C.8, namely, as part of a broader picture called Gelfand duality (which 
will fall into place if one uses the language of category theory, see Appendix E). 
Theorem C.8 is a consequence of the following two propositions.

Proposition C.18. Let $A$ and $B$ be unital commutative C*-algebras. Then

$$\varphi = \alpha^*, \tag{C.28}$$

where $\alpha^*(\omega) = \omega \circ \alpha$, establishes a bijective correspondence between unital
homomorphisms $\alpha : A \to B$ and continuous maps $\varphi : \Sigma(B) \to \Sigma(A)$.

In particular, $\Sigma(A)$ and $\Sigma(B)$ are homeomorphic iff $A$ and $B$ are isomorphic.

Proof. Since $\alpha(ab) = \alpha(a)\alpha(b)$, if $\omega \in \Sigma(B)$ it is clear that then $\alpha^*(\omega) \in \Sigma(A)$.

Conversely, denoting the pertinent Gelfand transforms by $G_A : A \to C(\Sigma(A))$ and
$G_B : B \to C(\Sigma(B))$, given $\varphi : \Sigma(B) \to \Sigma(A)$, we define $\alpha : A \to B$ by

$$\alpha = G_B^{-1} \circ \varphi^* \circ G_A, \tag{C.29}$$

where $\varphi^* : C(\Sigma(A)) \to C(\Sigma(B))$ is the pullback of $\varphi$ (i.e., $\varphi^*(f) = f \circ \varphi$).

It is easy to verify that given $\varphi$, the map $\alpha$ defined in (C.29) returns $\varphi$ through
(C.28), whereas given $\alpha$, the map $\varphi$ defined in (C.28) returns $\alpha$ through (C.29). □

Proposition C.19. For any compact Hausdorff space $X$, the evaluation map

$$\mathrm{ev} : X \to \Sigma(C(X)); \tag{C.30}$$

$$\mathrm{ev}_x(f) = f(x), \tag{C.31}$$

is a homeomorphism, so that

$$\Sigma(C(X)) \cong X. \tag{C.32}$$

Proof. Injectivity of ev immediately follows from Urysohn’s lemma (which applies
because a compact Hausdorff space is normal), which implies that $C(X)$ separates
points on $X$ (i.e., for all $x \neq y$ there is an $f \in C(X)$ for which $f(x) \neq f(y)$).

To prove surjectivity, suppose there is $\omega \in \Sigma(C(X))$ such that $\omega \neq \mathrm{ev}_x$ for all $x \in
X$. Now $\ker(\omega) = \ker(\mathrm{ev}_x)$ would imply $\omega = \mathrm{ev}_x$ (because $\omega(f) = \lambda$ then implies
$f - \lambda \cdot 1_X \in \ker(\omega)$, and hence $f(x) = \lambda$, and vice versa), so $\ker(\omega) \neq \ker(\mathrm{ev}_x)$.
Since $\mathrm{ev}_x \in \Sigma(C(X))$, and $\omega \in \Sigma(C(X))$ by assumption, by Proposition C.13 both
kernels are maximal ideals in $C(X)$, and hence $\ker(\omega) \subset \ker(\mathrm{ev}_x)$ is impossible (and
so is the opposite inclusion). Therefore, for each $x$ there is a function $f_x \in \ker(\omega)$
for which $f_x(x) \neq 0$ (for otherwise $f(x) = 0$ for all $f \in \ker(\omega)$, so that $\ker(\omega) \subset
\ker(\mathrm{ev}_x)$). Redefining $f_x$ by a phase if necessary, we may assume that $f_x(x) > 0$, and
taking the real part of $f_x$ if necessary, we may also assume that $f_x$ is real-valued.

For each $x$, the set $U_x$ where $f_x > 0$ is open, because $f_x$ is continuous. This gives a covering $\{U_x\}_{x\in X}$ of $X$, which by compactness has a finite subcovering $\{U_{x_n}\}_{n=1,\ldots,N}$. Then define the function 

$$f = \sum_{n=1}^{N} f_{x_n}^2, \tag{C.33}$$

which is strictly positive by construction (each term is nonnegative, and every point of $X$ lies in some $U_{x_n}$, where $f_{x_n} > 0$), so that it is invertible. But $\ker(\omega)$ is an ideal, so that, with all $f_{x_n} \in \ker(\omega)$ (since all $f_x \in \ker(\omega)$) also $f \in \ker(\omega)$. But an ideal containing an invertible element must contain $1_X$ and hence coincides with $C(X)$, contradicting the fact that $\ker(\omega)$ was maximal. Hence ev is surjective. 

Finally, to prove that ev is a homeomorphism, we equip $X$ with the topology induced by ev, in which the open sets are of the form $\mathrm{ev}^{-1}(U)$, with $U$ open in $\Sigma(C(X))$ in the Gelfand topology. We claim that this new topology on $X$ is weaker than the original one (this terminology includes the possibility that the two topologies in question coincide). Namely, for $f \in C(X)$ one has $\hat{f}\circ\mathrm{ev} = f$. Therefore, since the Gelfand topology on $\Sigma(C(X))$ is the weakest topology for which all Gelfand transforms $\hat{f}$ are continuous, the new topology on $X$ is the weakest topology for which all $f$ are continuous. But $f$ was already continuous with respect to the given topology, so the claim follows. Without proof we now state a result from topology: 

Lemma C.20. If a set $X$ is Hausdorff in some topology $\mathcal{T}_1(X)$ and compact in a topology $\mathcal{T}_2(X)$, and if $\mathcal{T}_1(X) \subseteq \mathcal{T}_2(X)$, then $\mathcal{T}_1(X) = \mathcal{T}_2(X)$. 

Since $X$ is in fact compact and Hausdorff in both topologies, we conclude from this lemma that the new topology on $X$ must coincide with the original one. □ 

Uniqueness of the Gelfand spectrum up to homeomorphism follows from Propositions C.18 and C.19: if $A$ is a unital commutative C*-algebra for which $A \cong C(X)$ as well as $A \cong C(Y)$, then applying $\Sigma$ and using (C.32) makes $X$ and $Y$ both homeomorphic to $\Sigma(A)$, and hence to each other. 

With minor changes, the proof of Proposition C.19 applies also to “well-behaved” manifolds, by which we mean second countable smooth locally compact Hausdorff manifolds. These are the ones encountered in physics (especially in classical mechanics); we need this for Theorem 3.10 in the main text. Such manifolds admit partitions of unity subordinate to any given cover $(U_j)$ that are locally finite as well as countable, i.e., sequences of smooth functions $\chi_n : X \to [0,1]$ such that: 

1. Each $x \in X$ has an open neighbourhood $U$ that intersects only finitely many of the sets $\mathrm{supp}(\chi_n)$; 

2. For each $x \in X$ we have $\sum_n \chi_n(x) = 1$ (where the sum is finite); 

3. Each set $\mathrm{supp}(\chi_n)$ is contained in some $U_j$. 

Furthermore, $\Sigma(C^\infty(X))$ is defined as for any complex associative algebra $A$, i.e., as the set of nonzero multiplicative linear maps $\omega : C^\infty(X) \to \mathbb{C}$. 

Proposition C.21. For any second countable smooth locally compact Hausdorff manifold $X$, the evaluation map $\mathrm{ev} : X \to \Sigma(C^\infty(X))$ in (C.31) is a bijection. 

Proof. Since $X$ is not necessarily compact, we cannot use Urysohn's Lemma directly to prove that $C^\infty(X)$ separates points of $X$ (so that ev is injective), but this time, if $U \subset X$ is open and $F \subset U$ is closed, there exists a smooth function $\chi : X \to [0,1]$ such that $\chi = 1$ on $F$ and $\chi = 0$ on $X\setminus U$. Indeed, $\{U, X\setminus F\}$ is an open cover of $X$, and if $(\chi_U, \chi_{X\setminus F})$ is a partition of unity subordinate to this cover, $\chi = \chi_U$ will do. Now for $x \neq y$, take $F = \{x\}$ and use the Hausdorff property to separate $(x, y)$ by disjoint open sets $(U, V)$, and we have $\chi(x) = 1$ whilst $\chi(y) = 0$. 

The proof of surjectivity is the same as for $C(X)$, including the proof that $\ker(\omega)$ is a maximal ideal in $C^\infty(X)$, until the point (C.33) is reached. Here compactness is no longer available, so that we need to replace (C.33) by the expression 

$$f = \sum_n c_n \chi_n f_{x_n}, \tag{C.34}$$

where $(\chi_n)$ is a smooth partition of unity subordinate to the cover $(U_x)$, for each $n \in \mathbb{N}$, $f_{x_n}$ is picked by no. 3 in the list of properties of a partition of unity listed above, and the coefficients $c_n$ are chosen so that $0 < c_n < (n^2\|\chi_n f_{x_n}\|_\infty)^{-1}$ (note that $\chi_n$ and hence $\chi_n f_{x_n}$ has compact support and is continuous, so that it is bounded). Since $\sum_n (1/n^2) < \infty$, the insertion of the $c_n$ makes $f$ bounded and the sum (C.34) uniformly convergent, which is necessary to pull $\omega$ through the sum so as to prove that $f \in \ker(\omega)$, as follows. Since the sup-norm is not defined on all of $C^\infty(X)$, we need a little argument here. Take $t > \|f\|_\infty$, so that $t\cdot 1_X \pm f$ nowhere vanishes and hence is invertible, so that $\omega(t\cdot 1_X \pm f) = t \pm \omega(f) \neq 0$ by multiplicativity of $\omega$, i.e., $\pm\omega(f) \neq t$. Since $f$ and hence $\omega(f)$ is real, this gives $|\omega(f)| \leq \|f\|_\infty$. Since $\omega(f_{x_n}) = 0$, and similarly for each finite sum in (C.34), we finally obtain 

$$|\omega(f)| = \Bigl|\omega\Bigl(f - \sum_{n=1}^{N} c_n\chi_n f_{x_n}\Bigr)\Bigr| \leq \Bigl\|f - \sum_{n=1}^{N} c_n\chi_n f_{x_n}\Bigr\|_\infty, \tag{C.35}$$

so letting $N \to \infty$ gives $\omega(f) = 0$, or $f \in \ker(\omega)$. Since $f$ is invertible, this implies $1_X \in \ker(\omega)$ and hence $\ker(\omega) = C^\infty(X)$, contradicting $\omega \neq 0$. □ 

Corollary C.22. Let $X$ and $Y$ be compact Hausdorff spaces. Then $\alpha(f) = f\circ\varphi$, i.e., 

$$\alpha = \varphi^*, \tag{C.36}$$

establishes a canonical bijective correspondence between unital homomorphisms $\alpha : C(Y) \to C(X)$ (as C*-algebras) and continuous maps $\varphi : X \to Y$. In particular, $C(X)$ and $C(Y)$ are isomorphic iff $X$ and $Y$ are homeomorphic. 

Likewise, if $X$ and $Y$ are second countable smooth locally compact Hausdorff manifolds, eq. (C.36) gives a canonical bijective correspondence between homomorphisms $\alpha : C^\infty(Y) \to C^\infty(X)$ (as commutative algebras) and smooth maps $\varphi : X \to Y$. In particular, $C^\infty(X)$ and $C^\infty(Y)$ are isomorphic iff $X$ and $Y$ are diffeomorphic. 

Proof. The passage from $\varphi$ to $\alpha$ is obvious. We write $\mathrm{ev}_X : X \to \Sigma(C(X))$ and $\mathrm{ev}_Y : Y \to \Sigma(C(Y))$ for the bijections previously just called ev. Since these maps are invertible by the previous proposition, we may define a map $\varphi : X \to Y$ by 

$$\varphi = \mathrm{ev}_Y^{-1}\circ\alpha^*\circ\mathrm{ev}_X, \tag{C.37}$$

where $\alpha^* : \Sigma(C(X)) \to \Sigma(C(Y))$ is defined by $\alpha^*(\omega) = \omega\circ\alpha$; this lies in $\Sigma(C(Y))$, because $\alpha$ is linear and $\alpha(fg) = \alpha(f)\alpha(g)$. Eq. (C.36) then holds by construction. 

• In the compact case, we still need to prove that $\varphi$ is continuous. To do so, note that a compact Hausdorff space $Y$ is completely regular, and as such a subbase for its topology is given by sets of the form $U = f^{-1}(U')$, where $f \in C(Y)$ and $U' \subset \mathbb{C}$ is open. Hence $\varphi^{-1}(U) = (\varphi^*f)^{-1}(U')$, and since we know that $\varphi^*f = \alpha(f) \in C(X)$, we conclude that $\varphi^{-1}(U)$ is open in $X$. Thus $\varphi$ is continuous. 

• Similarly, in the manifold case, a map $\varphi : X \to Y$ is smooth iff $\varphi^*f \in C^\infty(X)$ for each $f \in C^\infty(Y)$; using localization by bump functions à la the first part of the proof of Proposition C.21, it is enough to prove this for open sets $X \subset \mathbb{R}^n$ and $Y \subset \mathbb{R}^m$, so that $\varphi(x) = (\varphi^1(x), \ldots, \varphi^m(x))$. Knowing that $\varphi^*f$ is smooth for each $f \in C^\infty(Y)$, we simply take $f(y^1, \ldots, y^m) = y^k$ to be the $k$'th coordinate function. This declares each $\varphi^k$ to be smooth, and therewith also $\varphi$ itself. □ 

We now state Gelfand duality, explaining its categorical interpretation in §E.1. 

Theorem C.23. 1. If $X$ is a compact Hausdorff space, then $C(X)$ is a unital commutative C*-algebra. A continuous map $\varphi : X \to Y$ induces a unital homomorphism $C(\varphi) = \varphi^* : C(Y) \to C(X)$, which behaves well under composition, in that: 

• If $\varphi$ is the identity, then so is $C(\varphi)$. 

• If $\psi : Y \to Z$ is another continuous map, then $C(\psi\circ\varphi) = C(\varphi)\circ C(\psi)$. 

2. If $A$ is a unital commutative C*-algebra, then $\Sigma(A)$ is a compact Hausdorff space. A unital homomorphism $\alpha : A \to B$ induces a continuous function $\Sigma(\alpha) = \alpha^* : \Sigma(B) \to \Sigma(A)$, which behaves well under composition in a similar way: 

• If $\alpha$ is the identity, then so is $\Sigma(\alpha)$. 

• If $\beta : B \to C$ is another unital homomorphism, then $\Sigma(\beta\circ\alpha) = \Sigma(\alpha)\circ\Sigma(\beta)$. 

3. There are canonical homeomorphisms and isomorphisms: 

$$\mathrm{ev}_X : X \xrightarrow{\ \cong\ } \Sigma(C(X)); \tag{C.38}$$

$$G_A : A \xrightarrow{\ \cong\ } C(\Sigma(A)), \tag{C.39}$$

with the following “naturality” properties: 

• If $\Sigma\circ C(\varphi) : \Sigma(C(X)) \to \Sigma(C(Y))$ is the map induced by $\varphi : X \to Y$, then 

$$\Sigma\circ C(\varphi)\circ\mathrm{ev}_X = \mathrm{ev}_Y\circ\varphi. \tag{C.40}$$

• If $C\circ\Sigma(\alpha) : C(\Sigma(A)) \to C(\Sigma(B))$ is the map induced by $\alpha : A \to B$, then 

$$C\circ\Sigma(\alpha)\circ G_A = G_B\circ\alpha. \tag{C.41}$$

Proof The proof is an assembly of previous results and routine verifications. □ 

In the language of category theory, Theorem C.23 states that the categories $\mathsf{CH}$ of compact Hausdorff spaces (with continuous functions as arrows) and $\mathsf{CCA}_1$ of commutative unital C*-algebras (with unital homomorphisms as arrows, cf. Definition C.2) are dual (i.e., contravariantly equivalent). In particular, we have an adjunction between the functors $C : \mathsf{CH} \to \mathsf{CCA}_1$ and $\Sigma : \mathsf{CCA}_1 \to \mathsf{CH}$. 
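
The contravariance in Theorem C.23 can be checked by hand in the simplest possible setting. The following is only an illustrative sketch, under the assumption that the spaces are finite (hence compact Hausdorff in the discrete topology); the dictionaries `phi`, `psi`, `g`, `h` and the helper names are hypothetical and not part of the text. Functions are modelled as dicts from points to complex values.

```python
def pullback(phi):
    """Return C(phi) = phi^*, sending f in C(Y) to f o phi in C(X).

    A map phi: X -> Y is a dict x -> phi(x); a function on a finite
    space is a dict point -> complex value.
    """
    return lambda f: {x: f[y] for x, y in phi.items()}


def compose(psi, phi):
    """Return psi o phi (first phi, then psi) for maps given as dicts."""
    return {x: psi[y] for x, y in phi.items()}


def ev(x):
    """Evaluation character ev_x: f -> f(x)."""
    return lambda f: f[x]


# Hypothetical finite spaces X, Y, Z with maps phi: X -> Y and psi: Y -> Z.
phi = {"x1": "y1", "x2": "y1", "x3": "y2"}
psi = {"y1": "z2", "y2": "z1"}
g = {"y1": 1j, "y2": 5}            # an element of C(Y)
h = {"z1": 2 + 1j, "z2": -3.0}     # an element of C(Z)

# Contravariance: C(psi o phi) = C(phi) o C(psi).
lhs = pullback(compose(psi, phi))(h)
rhs = pullback(phi)(pullback(psi)(h))

# Naturality as in (C.40): ev_x o phi^* = ev_{phi(x)}.
natural = all(ev(x)(pullback(phi)(g)) == ev(phi[x])(g) for x in phi)
```

In this toy model both sides of the composition rule give the same function on X, and the evaluation characters transform as (C.40) prescribes.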
C.4 Gelfand isomorphism and spectral theory 

As an example of Gelfand’s theory, Theorem 4.3 may be reformulated as follows: 

Theorem C.24. Let $H$ be a Hilbert space, and let $a = a^* \in B(H)_{\mathrm{sa}}$, with associated (commutative) C*-algebra $C^*(a)$ generated by $a$ and $1_H$. The Gelfand spectrum $\Sigma(C^*(a))$ of $C^*(a)$ is homeomorphic to $\sigma(a)$ under the mutually inverse maps 

$$\Sigma(C^*(a)) \to \sigma(a), \quad \omega \mapsto \omega(a); \tag{C.42}$$

$$\sigma(a) \to \Sigma(C^*(a)), \quad \lambda \mapsto \omega_\lambda : f(a) \mapsto f(\lambda). \tag{C.43}$$

In particular, the image of the map $\omega \mapsto \omega(a)$ from $\Sigma(C^*(a))$ to $\mathbb{C}$ is $\sigma(a)$, and the isomorphism $C^*(a) \to C(\sigma(a))$, $f(a) \mapsto f$, of Theorem B.94 is obtained by composing the Gelfand transform $f(a) \mapsto \widehat{f(a)}$ from $C^*(a)$ to $C(\Sigma(C^*(a)))$ with the isomorphism $C(\Sigma(C^*(a))) \to C(\sigma(a))$ obtained by pulling back the map (C.43). 

Proof. First, we note that the map (C.43) is well defined. Indeed, it follows from (B.289) that the map $\omega_\lambda : C^*(a) \to \mathbb{C}$ is linear for any $\lambda \in \sigma(a)$, whilst the following computation, which uses (B.290), implies that $\omega_\lambda$ is multiplicative: 

$$\omega_\lambda(f(a)g(a)) = \omega_\lambda((fg)(a)) = (fg)(\lambda) = f(\lambda)g(\lambda) = \omega_\lambda(f(a))\,\omega_\lambda(g(a)). \tag{C.44}$$

Injectivity of the map $\lambda \mapsto \omega_\lambda$ holds because $\sigma(a)$ is Hausdorff, so that $f(\lambda') = f(\lambda)$ for each $f \in C(\sigma(a))$ implies $\lambda' = \lambda$. Surjectivity follows from (B.253), since 

$$\sigma_{C^*(a)}(f(a)) = \sigma_{C(\sigma(a))}(f) = \mathrm{Ran}(f), \tag{C.45}$$

where we used invariance of the spectrum under isomorphisms. Consider the function $f(x) = x$, so that $f(a) = a$. It follows from (C.43) that $\omega_\lambda(a) = \lambda$. Conversely, using the same function $f$, for given $\omega \in \Sigma(C^*(a))$ we find $\omega_{\omega(a)} = \omega$, so that the maps in (C.42) - (C.43) are mutually inverse. It is clear from (C.42) - (C.43) that $\omega_{\lambda_i} \to \omega_\lambda$ in the Gelfand topology on $\Sigma(C^*(a))$ (which is the topology of pointwise convergence) iff $f(\lambda_i) \to f(\lambda)$ for each $f \in C(\sigma(a))$, which is the case iff $\lambda_i \to \lambda$ in $\sigma(a)$. Hence both of our maps $\Sigma(C^*(a)) \leftrightarrow \sigma(a)$ are continuous. 

The final claim is a definition chase, using the computation 

$$\widehat{f(a)}(\omega_\lambda) = \omega_\lambda(f(a)) = f(\lambda). \qquad \Box$$

If $\dim(H) < \infty$, one may replace this proof by using the fact that $\sigma(a)$ consists of the eigenvalues of $a$. If $p$ is a polynomial, then $\omega \in \Sigma(C^*(a))$ must satisfy $\omega(p(a)) = p(\omega(a))$. The characteristic polynomial $p_c$ of $a$, i.e., $p_c(x) = \prod_{i=1}^{n}(\lambda_i - x)$, where the $\lambda_i$ are the $n = \dim(H)$ eigenvalues of $a$ (including repetitions), satisfies $p_c(a) = 0$, so that $\omega(p_c(a)) = 0$, i.e., $\prod_{i=1}^{n}(\lambda_i - \omega(a)) = 0$, and hence $\omega(a) = \lambda_i$ for some $i$, or $\omega(a) \in \sigma(a)$. Thus (C.42) is well defined. In the opposite direction, eqs. (A.53) - (A.55) show that (C.43) is also well defined, in that indeed $\omega_\lambda \in \Sigma(C^*(a))$. 
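
The finite-dimensional argument can be traced on a concrete matrix. The sketch below assumes the hypothetical example $a = \begin{pmatrix} 2 & 1 \\ 1 & 2\end{pmatrix}$ (not taken from the text), whose eigenvalues 1 and 3 were found by hand; it verifies that $p_c(a) = 0$ and that the only values a character could assign to $a$ are roots of $p_c$.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def shifted(lam, m):
    """Return lam * 1 - m for a 2x2 matrix m."""
    return [[(lam if i == j else 0) - m[i][j] for j in range(2)]
            for i in range(2)]


# Hypothetical selfadjoint matrix; eigenvectors (1, -1) and (1, 1).
a = [[2, 1], [1, 2]]
eigenvalues = [1, 3]

# p_c(a) = (lambda_1 * 1 - a)(lambda_2 * 1 - a) should vanish.
p_c_of_a = matmul(shifted(eigenvalues[0], a), shifted(eigenvalues[1], a))


def p_c(x):
    """Characteristic polynomial p_c(x) = (lambda_1 - x)(lambda_2 - x)."""
    return (eigenvalues[0] - x) * (eigenvalues[1] - x)


# A character omega must satisfy p_c(omega(a)) = omega(p_c(a)) = 0, so among
# integer candidates only the eigenvalues survive.
admissible = [c for c in range(-5, 6) if p_c(c) == 0]
```

Here `admissible` lists the candidate values of $\omega(a)$ compatible with $p_c(\omega(a)) = 0$, in line with the claim that $\omega(a) \in \sigma(a)$.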
The construction of $C^*(a)$ as a C*-algebra within $B(H)$ may trivially be generalized to arbitrary unital C*-algebras $A$, i.e., if $a \in A$, we define $C^*(a)$ as the C*-algebra generated (within $A$) by $a$ and the unit $1_A$. If $a = a^*$, then $C^*(a)$ still equals the norm-closure of the algebra of all polynomials in $a$, and hence $C^*(a)$ is once again commutative. Defining the spectrum $\sigma(a)$ as in Definition B.81, we then have the following generalization of Theorem C.24: 

Theorem C.25. Let $A$ be a unital C*-algebra and let $a^* = a \in A$. Then 

$$\Sigma(C^*(a)) \cong \sigma(a), \quad \omega \mapsto \omega(a); \tag{C.46}$$

$$C^*(a) \cong C(\sigma(a)), \quad f(a) \mapsto f, \tag{C.47}$$

as spaces and as (commutative) C*-algebras, respectively. Under the Gelfand isomorphism (C.47), the Gelfand transform $\hat{a}$ of $a \in C^*(a)$ is the identity $\mathrm{id}_{\sigma(a)} : \lambda \mapsto \lambda$, whereas the Gelfand transform of $1_A \in C^*(a)$ is the unit $1_{\sigma(a)} : \lambda \mapsto 1$. 

This continuous functional calculus may be proved in exactly the same way as Theorems B.94 and C.24, with $B(H)$ replaced by $A$. However, these proofs did not invoke Gelfand's Theorem (but rather derived it in the special case at hand), so it may give additional insight into the situation if we reprove Theorem C.25 from Theorem C.8. 

Proof. We now assume the isomorphism $C^*(a) \cong C(\Sigma(C^*(a)))$ via the Gelfand transform. According to (C.22) and (B.253), which imply $\sigma(a) = \mathrm{ran}(\hat{a})$, the function $\hat{a} : \Sigma(C^*(a)) \to \mathbb{C}$ is surjective onto the spectrum $\sigma(a) \subset \mathbb{C}$. We now prove injectivity. If $\omega_1, \omega_2 \in \Sigma(C^*(a))$ and $\omega_1(a) = \omega_2(a)$, then, for all $n \in \mathbb{N}$, we have 

$$\omega_1(a^n) = \omega_1(a)^n = \omega_2(a)^n = \omega_2(a^n). \tag{C.48}$$

Since also $\omega_1(1_A) = \omega_2(1_A) = 1$, we conclude by linearity that $\omega_1 = \omega_2$ on all polynomials in $a$. By continuity (cf. Lemma C.9) this implies that $\omega_1 = \omega_2$, since by definition the linear span of all polynomials is dense in $C^*(a)$. Using (C.12), we have therefore proved that $\hat{a}(\omega_1) = \hat{a}(\omega_2)$ implies $\omega_1 = \omega_2$, i.e., $\hat{a}$ is injective. 

Since $\hat{a} \in C(\Sigma(C^*(a)))$ by Theorem C.8, $\hat{a}$ is continuous. To prove continuity of the inverse, recall that $\hat{a} : \Sigma(C^*(a)) \to \sigma(a)$ is the map $\omega \mapsto \omega(a)$, so that for $\lambda \in \sigma(a)$, the functional $\hat{a}^{-1}(\lambda) \in \Sigma(C^*(a))$ maps $a$ to $\lambda$. By multiplicativity, $\hat{a}^{-1}(\lambda)$ then maps $a^n$ to $\lambda^n$. Hence by linearity and (C.14), for polynomials $p$ in $a$ one has 

$$\hat{a}^{-1}(\lambda) : p(a) \mapsto p(\lambda). \tag{C.49}$$

Since polynomials are continuous, if $\lambda_n \to \lambda$ in $\sigma(a)$, then $p(\lambda_n) \to p(\lambda)$, so 

$$(\hat{a}^{-1}(\lambda_n))(p(a)) \to (\hat{a}^{-1}(\lambda))(p(a)). \tag{C.50}$$

Since such polynomials $p(a)$ are dense in $C^*(a)$ by definition, and functionals in $\Sigma(C^*(a))$, being continuous, are therefore determined by their values on polynomials, we conclude that $\hat{a}^{-1}(\lambda_n) \to \hat{a}^{-1}(\lambda)$ pointwise. Since the Gelfand topology is the topology of pointwise convergence, we conclude that $\hat{a}^{-1}$ is continuous, so that $\hat{a}$ is a homeomorphism. This proves (C.46). 
Finally, for compact Hausdorff spaces $X$ and $Y$, a homeomorphism $\varphi : X \to Y$ induces an isomorphism $\varphi^* : C(Y) \to C(X)$ of C*-algebras, where $\varphi^*(f) = f\circ\varphi$ (cf. §C.3). Theorem C.8 and (C.46) give (C.47). Unfolding the latter isomorphism gives 

$$C^*(a) \xrightarrow{\ GT\ } C(\Sigma(C^*(a))) \xrightarrow{\ (\hat{a}^{-1})^*\ } C(\sigma(a)), \tag{C.51}$$

where $GT$ is the Gelfand transform and $(\hat{a}^{-1})^*$ is the pullback of the homeomorphism $\hat{a}^{-1} : \sigma(a) \to \Sigma(C^*(a))$, as in $\varphi^*$ above. Following these arrows and using (C.49), one obtains the last claim. □ 

Corollary C.26. Let $A$ be a unital C*-algebra. For each selfadjoint element $a^* = a \in A$, with spectrum $\sigma(a)$, and each $f \in C(\sigma(a))$, there is an operator $f(a) \in A$, which is the obvious expression when $f$ is a polynomial (and in general is given via the uniform approximation of $f$ by polynomials), such that 

$$\|f(a)\| = \|f\|_\infty; \tag{C.52}$$

$$\sigma(f(a)) = f(\sigma(a)). \tag{C.53}$$

Eq. (C.53) is called the spectral mapping property. Furthermore, the norm and spectrum of $a$ as an element of $A$ coincide with the norm and spectrum of $a$ in $C^*(a)$. 

Proof. We write (C.51) in the opposite direction, i.e., 

$$C(\sigma(a)) \to C(\Sigma(C^*(a))) \to C^*(a). \tag{C.54}$$

Indeed, if $\tilde{f} \in C(\Sigma(C^*(a)))$ is the image of $f \in C(\sigma(a))$ under the first arrow, then $\tilde{f}(\omega) = f(\omega(a))$, and the second arrow says that $\widehat{f(a)} = \tilde{f}$. Together these give $f(\omega(a)) = \omega(f(a))$, which by multiplicativity, linearity, and (C.14), is the case for polynomials $f = p$; the general case follows from the polynomial case by continuity. 

Eq. (C.52) follows from (C.18) and the fact that also the first arrow in (C.54) is an isometry, and (C.53) follows from (C.22), with $a$ replaced by $f(a)$. 

To close, take $f = \mathrm{id}_{\sigma(a)}$; then (C.52) gives $\|a\|_A = r(a)$, cf. (B.257), whilst (C.18) gives $\|a\|_{C^*(a)} = r(a)$, too. Finally, (C.47) and (B.253) show that the spectrum of $a$ in $C^*(a)$ is $\sigma(a)$, which by definition is its spectrum in $A$. □ 

Corollary C.27. If $a^* = a$, then $\sigma(a) \subset \mathbb{R}$. 

Proof. By Corollary C.26, we may take the spectrum of $a$ in $C^*(a)$. By Lemma C.11, the Gelfand transform $\hat{a}$ is real-valued. Then use the last part of Theorem C.25. □ 

Corollary C.28. The norm in a C*-algebra is unique (given all other structure). 

Proof. Using (B.257) for $a = a^*$, and then (C.2), for arbitrary $a \in A$ we find 

$$\|a\| = \sqrt{r(a^*a)}. \tag{C.55}$$

Since the spectrum (and hence the spectral radius $r$) is determined by the algebraic structure, (C.55) shows that the norm is determined by the algebraic structure. □ 
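
Formula (C.55) can be made concrete for $2\times 2$ complex matrices, where the spectral radius of the selfadjoint matrix $a^*a$ follows from the quadratic formula. This is a minimal sketch under that assumption; the helper names are hypothetical. The nilpotent example shows why $a^*a$ rather than $a$ itself must be used: its spectral radius is 0, yet its norm is 1.

```python
import math


def adjoint(m):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_hermitian(h):
    """Largest |eigenvalue| of a 2x2 hermitian matrix (quadratic formula)."""
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def c_star_norm(a):
    """Operator norm computed as sqrt(r(a* a)), cf. (C.55)."""
    return math.sqrt(spectral_radius_hermitian(matmul(adjoint(a), a)))


nilpotent = [[0, 1], [0, 0]]     # both eigenvalues are 0, so r(a) = 0
diagonal = [[3, 0], [0, 4j]]     # eigenvalues 3 and 4i

norm_nilpotent = c_star_norm(nilpotent)
norm_diagonal = c_star_norm(diagonal)
```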
C.5 C*-algebras without unit: general theory 

In classical physics, non-compact phase spaces are described by commutative C*- 
algebras without unit. Proper ideals in C*-algebras necessarily lack a unit, too. To 
set the stage, we first assume that A is a Banach algebra, and form the vector space 

$$\dot{A} = A \oplus \mathbb{C}, \tag{C.56}$$

and turn this into an algebra in the obvious way, i.e., by means of 

$$(a + \lambda\cdot 1_{\dot{A}})(b + \mu\cdot 1_{\dot{A}}) = ab + \lambda b + \mu a + \lambda\mu\cdot 1_{\dot{A}}, \tag{C.57}$$

where we have written $a + \lambda\cdot 1_{\dot{A}}$ for $(a, \lambda)$, etc. This turns the number 1 in $\mathbb{C}$ into a unit $1_{\dot{A}}$ for $\dot{A}$, and this is the point: $\dot{A}$ is unital, even if $A$ lacks a unit. Defining 

$$\|a + \lambda\cdot 1_{\dot{A}}\| = \|a\| + |\lambda|, \tag{C.58}$$

we also have a norm on $\dot{A}$, with $\|1_{\dot{A}}\| = 1$. Using (C.1), (C.57), and (C.58), we have 

$$\|(a + \lambda\cdot 1_{\dot{A}})(b + \mu\cdot 1_{\dot{A}})\| \leq \|a\|\,\|b\| + |\lambda|\,\|b\| + |\mu|\,\|a\| + |\lambda|\,|\mu| = \|a + \lambda\cdot 1_{\dot{A}}\|\,\|b + \mu\cdot 1_{\dot{A}}\|,$$

so that $\dot{A}$ is a Banach algebra with unit. Since by (C.58) the norm of $a \in A$ in $A$ coincides with the norm of $a + 0\cdot 1_{\dot{A}}$ in $A \oplus \mathbb{C}$, we have shown the following: 

Proposition C.29. For every Banach algebra $A$ (with or without unit) there exists a unital Banach algebra $\dot{A}$, called the unitization of $A$, and an isometric (hence injective) morphism $A \to \dot{A}$, such that $\dot{A}/A \cong \mathbb{C}$. 
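
The product (C.57) and the norm (C.58) are easy to experiment with. The sketch below assumes, purely for illustration, that $A = \mathbb{C}^2$ with pointwise multiplication and the sup-norm (this algebra happens to have a unit, but the construction does not use it); elements of $\dot{A}$ are pairs `(a, lam)`, and all names are hypothetical.

```python
def mul(x, y):
    """(a + lam 1)(b + mu 1) = ab + lam b + mu a + lam mu 1, cf. (C.57)."""
    (a, lam), (b, mu) = x, y
    prod = tuple(ai * bi + lam * bi + mu * ai for ai, bi in zip(a, b))
    return (prod, lam * mu)


def norm(x):
    """||a + lam 1|| = ||a|| + |lam|, cf. (C.58), with the sup-norm on A."""
    a, lam = x
    return max(abs(ai) for ai in a) + abs(lam)


unit = ((0, 0), 1)                 # the adjoined unit (0, 1)
x = ((1 + 2j, -3), 0.5)
y = ((2, 1j), -1)

xy = mul(x, y)
```

In this model `mul(unit, x)` returns `x`, and `norm(xy)` stays below `norm(x) * norm(y)`, as the estimate following (C.58) requires.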

If $A$ is a C*-algebra, (C.58) fails to be a C*-norm with respect to the involution 

$$(a + \lambda\cdot 1_{\dot{A}})^* = a^* + \bar{\lambda}\cdot 1_{\dot{A}}, \tag{C.59}$$

since (C.2) is not satisfied. Instead, the correct norm in which $A \oplus \mathbb{C}$ is a unital C*-algebra is the one borrowed from $B(A)$, i.e., the Banach space of bounded linear maps from $A$ to $A$ (regarded as a Banach space), relying on an embedding $A \subset B(A)$: 

Proposition C.30. Let $A$ be a C*-algebra (with or without unit). 

1. The map $L : A \to B(A)$, $a \mapsto L_a$, given by 

$$L_a(b) = ab \tag{C.60}$$

establishes an isometric isomorphism between $A$ and $L(A) \subset B(A)$. 

2. When $A$ has no unit, define a norm on $\dot{A} = A \oplus \mathbb{C}$ by 

$$\|a + \lambda\cdot 1_{\dot{A}}\| = \|L_a + \lambda\cdot 1_{B(A)}\|, \tag{C.61}$$

where the right-hand side uses the operator norm in $B(A)$. With the operations (C.57) and (C.59), the norm (C.61) turns $\dot{A}$ into a C*-algebra with unit. 
Proof. By (C.1) we have $\|L_a b\| = \|ab\| \leq \|a\|\,\|b\|$ for all $b$, so that $\|L_a\| \leq \|a\|$. On the other hand, using (C.2) and (A.22), assuming $a \neq 0$, we can write 

$$\|a\| = \frac{\|aa^*\|}{\|a\|} = \Bigl\|L_a\Bigl(\frac{a^*}{\|a\|}\Bigr)\Bigr\| \leq \|L_a\|. \tag{C.62}$$

Hence 

$$\|L_a\| = \|a\|. \tag{C.63}$$

Being isometric, the map $L$ must be injective; it is clearly a homomorphism, so that we have proved the first claim of the proposition. 

It is clear from (C.57) and (C.59) that the map $a + \lambda\cdot 1_{\dot{A}} \mapsto L_a + \lambda\cdot 1_{B(A)}$ is a homomorphism. Hence the norm (C.61) satisfies (C.1), for this is satisfied in the Banach algebra $B(A)$. In order to prove that the norm (C.61) satisfies (C.2), we note that if an involution on a Banach algebra $A$ satisfies $\|a\|^2 \leq \|a^*a\|$, then $A$ is a C*-algebra: since $\|a\|^2 \leq \|a^*a\| \leq \|a^*\|\,\|a\|$ gives $\|a\| \leq \|a^*\|$, substituting $a^*$ for $a$ gives $\|a^*\|^2 \leq \|aa^*\| \leq \|a\|\,\|a^*\|$, i.e., $\|a^*\| \leq \|a\|$, so that $\|a^*a\| \leq \|a\|^2$ and hence $\|a\|^2 = \|a^*a\|$. 

Thus it suffices to show that for each $a \in A$ and $\lambda \in \mathbb{C}$ we have 

$$\|L_a + \lambda\cdot 1_{B(A)}\|^2 \leq \|(L_a + \lambda\cdot 1_{B(A)})^*(L_a + \lambda\cdot 1_{B(A)})\|. \tag{C.64}$$

To prove (C.64), we note that by definition of the norm in $B(A)$, for given $T \in B(A)$ and $\varepsilon > 0$, there exists a $b \in A$, with $\|b\| = 1$, such that $\|T\|^2 - \varepsilon \leq \|T(b)\|^2$. Applying this with $T = L_a + \lambda\cdot 1_{B(A)}$, we infer that for every $\varepsilon > 0$ one has 

$$\|L_a + \lambda\cdot 1_{B(A)}\|^2 - \varepsilon \leq \|(L_a + \lambda\cdot 1_{B(A)})b\|^2 = \|ab + \lambda b\|^2 = \|(ab + \lambda b)^*(ab + \lambda b)\|.$$

Here we used (C.2) in $A$. Using (C.60), the right-hand side may be rearranged as 

$$\|L_{b^*}(L_a + \lambda\cdot 1_{B(A)})^*(L_a + \lambda\cdot 1_{B(A)})b\| \leq \|L_{b^*}\|\,\|(L_a + \lambda\cdot 1_{B(A)})^*(L_a + \lambda\cdot 1_{B(A)})\|\,\|b\|. \tag{C.65}$$

Since $\|L_{b^*}\| = \|b^*\| = \|b\| = 1$ by (C.63) and (A.22), and $\|b\| = 1$ also in the last term, the inequality (C.64) follows by letting $\varepsilon \to 0$. □ 

Hence the C*-algebraic version of Proposition C.29, slightly supplemented, is: 

Theorem C.31. For every C*-algebra $A$, there is a unique unital C*-algebra $\dot{A}$ and an isometric (hence injective) morphism $A \to \dot{A}$, such that $\dot{A}/A \cong \mathbb{C}$. Moreover, any homomorphism $\alpha : A \to B$ extends to a unital homomorphism $\dot{\alpha} : \dot{A} \to \dot{B}$ by 

$$\dot{\alpha}(a + \lambda\cdot 1_{\dot{A}}) = \alpha(a) + \lambda\cdot 1_{\dot{B}}. \tag{C.66}$$

Proof. Uniqueness of $\dot{A}$ follows from Corollary C.28; the rest is obvious. □ 

This is very important, if only for the following reason: 

Definition C.32. Let $A$ be a C*-algebra without unit. Then the spectrum $\sigma(a)$ of any $a \in A$ consists of all $\lambda \in \mathbb{C}$ for which the operator $a - \lambda$ is not invertible in $\dot{A}$. 

Proposition C.33. If $A$ has no unit, then $0 \in \sigma(a)$ for any $a \in A$. 
Proof. If $0 \notin \sigma(a)$, i.e., if $a$ were invertible in $\dot{A}$, then $a^{-1} = b + \mu\cdot 1_{\dot{A}}$ for some $b \in A$ and $\mu \in \mathbb{C}$. Then $1_{\dot{A}} = aa^{-1} = ab + \mu a \in A$. This is a contradiction. □ 

The spectral theory of compact operators provides a nice illustration of this proposition: see Theorem B.136.4. At the commutative end of the operator-algebraic world, we have the obvious fact that if $X$ is not compact, no $f \in C_0(X)$ is invertible. 

The construction of $\dot{A}$ through (C.56), (C.57), (C.59), and (C.61) also works verbatim if $A$ already has a unit $1_A$, in which case the spectrum $\sigma(a)$ of $a \in A$ may be compared with the spectrum $\sigma(\dot{a})$ of its image $\dot{a} = (a, 0)$ in $\dot{A}$. 

Lemma C.34. Let $A$ be a C*-algebra with unit, embedded in $\dot{A}$. For any $a \in A$, the spectrum $\sigma(a)$ in $A$ is related to the spectrum $\sigma(\dot{a})$ of its image $\dot{a} = (a, 0)$ in $\dot{A}$ by 

$$\sigma(\dot{a}) = \sigma(a) \cup \{0\}. \tag{C.67}$$

This will be important for the proof of the fundamental Theorem C.62 below. 

Proof. Suppose $0 \neq z \in \rho(a)$, so that $b = (a - z\cdot 1_A)^{-1}$ exists and satisfies 

$$ab - zb = ba - zb = 1_A. \tag{C.68}$$

Then $b' = b + z^{-1}\cdot(1_A - 1_{\dot{A}})$ satisfies $\dot{a}b' - zb' = b'\dot{a} - zb' = 1_{\dot{A}}$, so that $b' = (\dot{a} - z\cdot 1_{\dot{A}})^{-1}$ exists in $\dot{A}$, and hence $z \in \rho(\dot{a})$. Conversely, if $0 \neq z \in \rho(\dot{a})$ with corresponding $b'$ as before, then we first form $b = b' - z^{-1}\cdot(1_A - 1_{\dot{A}})$, which satisfies (C.68) but may not lie in $A$. If $b = b'' + \beta\cdot 1_{\dot{A}}$, where $b'' \in A$ and $\beta \in \mathbb{C}$, this is remedied by redefining $b''' = b + \beta\cdot(1_A - 1_{\dot{A}})$, which lies in $A$ and is inverse to $a - z\cdot 1_A$. Furthermore, by the proof of Proposition C.33 with $a$ replaced by $\dot{a}$, we always have $0 \in \sigma(\dot{a})$. If $0 \in \sigma(a)$, then the above argument gives $\sigma(\dot{a}) = \sigma(a)$, which is a special case of (C.67). If $0 \notin \sigma(a)$, then (C.67) follows as it stands. □ 

To close this section, we introduce the technique of approximate units, which will play a decisive role in the theory of ideals in C*-algebras (see §C.9). Let us first give an example. For any noncompact space $X$, the C*-algebra $C_0(X)$ has no unit (the unit would be $1_X$, which does not vanish at infinity because it is constant). There is a certain substitute for the absentee unit, though. Take $X = \mathbb{R}$ for simplicity, and pick a sequence of continuous functions $1_n : \mathbb{R} \to [0,1]$, $n \in \mathbb{N}$, that take the value 1 on $[-n, n]$ and vanish for $|x| > n + 1$. It is clear that one does not have $1_n \to 1_{\mathbb{R}}$ in the sup-norm, but instead one has $\lim_{n\to\infty}\|1_n f - f\|_\infty = 0$ for all $f \in C_0(\mathbb{R})$. More generally, one puts: 

Definition C.35. An approximate unit in a non-unital C*-algebra $A$ indexed by some directed set $\Lambda$ is a family $\{1_\lambda\}_{\lambda\in\Lambda}$ of selfadjoint elements of $A$, such that 

$$\|1_\lambda\| \leq 1, \tag{C.69}$$

and, for each $a \in A$, 

$$\lim_{\lambda\to\infty}\|1_\lambda a - a\| = \lim_{\lambda\to\infty}\|a 1_\lambda - a\| = 0. \tag{C.70}$$

Here the limit is meant in the sense of convergence of the nets $\lambda \mapsto \|1_\lambda a - a\|$ and $\lambda \mapsto \|a 1_\lambda - a\|$ in $\mathbb{R}$ indexed by $\Lambda$ (i.e., for each open neighbourhood $U$ of 0 in $\mathbb{R}$ there is some $\lambda_U \in \Lambda$ such that $\|1_\lambda a - a\| \in U$ for all $\lambda \geq \lambda_U$, etc.). 
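
The example of $C_0(\mathbb{R})$ preceding Definition C.35 can be probed numerically. The following is a rough sketch under explicit assumptions: the functions $1_n$ are taken to be piecewise linear between $n$ and $n+1$, the test function is $f(x) = 1/(1+x^2)$, and sup-norms are approximated on a finite grid in $[-50, 50]$; none of these choices comes from the text.

```python
def bump(n, x):
    """1_n(x): equals 1 on [-n, n], 0 for |x| >= n + 1, linear in between."""
    return min(1.0, max(0.0, n + 1 - abs(x)))


def f(x):
    """A sample element of C_0(R)."""
    return 1.0 / (1.0 + x * x)


grid = [k / 100.0 for k in range(-5000, 5001)]   # sample points in [-50, 50]


def defect(n):
    """Grid approximation of ||1_n f - f||_inf."""
    return max(abs(bump(n, x) * f(x) - f(x)) for x in grid)


def distance_to_one(n):
    """Grid approximation of ||1_n - 1||_inf, which does not shrink."""
    return max(abs(bump(n, x) - 1.0) for x in grid)


defects = {n: defect(n) for n in (1, 5, 20)}
```

Since $1_n f - f$ vanishes on $[-n, n]$ and is dominated by $|f|$ elsewhere, each `defects[n]` is at most $1/(1+n^2)$, whereas `distance_to_one(n)` stays equal to 1.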

Proposition C.36. Every non-unital C*-algebra $A$ has an approximate unit $\{1_\lambda\}_{\lambda\in\Lambda}$. When $A$ is separable, one may choose the directed set $\Lambda$ countable (i.e. $\Lambda = \mathbb{N}$). 

Proof. One takes $\Lambda$ to be the set of all finite subsets of $A$ (or, if $A$ is separable, of a countable dense subset of $A$), partially ordered by inclusion. Hence $\lambda \in \Lambda$ is of the form $\lambda = \{a_1, \ldots, a_n\}$, from which we build the element $b_\lambda = \sum_i a_i a_i^*$. Clearly $b_\lambda$ is selfadjoint, and according to Theorem C.52 and Proposition C.51 one has $\sigma(b_\lambda) \subset \mathbb{R}^+$, so that $n^{-1}1_{\dot{A}} + b_\lambda$ is invertible in the unitization $\dot{A}$ of $A$. Take 

$$1_\lambda = b_\lambda(n^{-1}1_{\dot{A}} + b_\lambda)^{-1}. \tag{C.71}$$

Since $b_\lambda^* = b_\lambda$ and $b_\lambda$ commutes with functions of itself like $(n^{-1}1_{\dot{A}} + b_\lambda)^{-1}$, one has $1_\lambda^* = 1_\lambda$. Although $(n^{-1}1_{\dot{A}} + b_\lambda)^{-1}$ is computed in $\dot{A}$, so that it is of the form $c + \mu 1_{\dot{A}}$ (for some $c \in A$ and $\mu \in \mathbb{C}$), one has $1_\lambda = b_\lambda c + \mu b_\lambda$, which lies in $A$. Using the continuous functional calculus (i.e. Theorem C.25) with $f(t) = t/(n^{-1} + t)$ on $b_\lambda$, one sees from (C.53) and the positivity of $b_\lambda$ that $\sigma(1_\lambda) \subset [0,1]$. This implies (C.69) because of (B.257). Putting $c_i = 1_\lambda a_i - a_i$, a simple computation shows that 

$$\sum_i c_i c_i^* = n^{-2}b_\lambda(n^{-1}1_{\dot{A}} + b_\lambda)^{-2}. \tag{C.72}$$

We now apply (C.52) with $a$ replaced by $b_\lambda$ and $f(t) = n^{-2}t(n^{-1} + t)^{-2}$. Since $f \geq 0$, and $f$ assumes its maximum at $t = 1/n$, one has $\sup_{t\in\mathbb{R}^+}|f(t)| = 1/4n$. As $\sigma(b_\lambda) \subset \mathbb{R}^+$, it follows that $\|f\|_\infty \leq 1/4n$. Therefore, by (C.52) we have 

$$\|n^{-2}b_\lambda(n^{-1}1_{\dot{A}} + b_\lambda)^{-2}\| \leq 1/4n, \tag{C.73}$$

so that $\|\sum_i c_i c_i^*\| \leq 1/4n$ by (C.72). By Lemma C.37 below this implies that $\|c_i c_i^*\| \leq 1/4n$ for each $i = 1, \ldots, n$. Since any $a \in A$ sits in some directed subset of $\Lambda$ with $n \to \infty$, eq. (C.2) implies 

$$\lim_{\lambda\to\infty}\|1_\lambda a - a\|^2 = \lim_{\lambda\to\infty}\|(1_\lambda a - a)^*(1_\lambda a - a)\| = \lim_{\lambda\to\infty}\|c_i^* c_i\| = 0. \tag{C.74}$$

The other equality in (C.70) follows analogously. □ 
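
The two scalar functions used in this proof can be checked directly. The sketch below only tests, on a finite grid of sample points (an assumption of the illustration, as are the function names), that $t \mapsto t/(n^{-1}+t)$ takes values in $[0,1)$ on $\mathbb{R}^+$ and that $t \mapsto n^{-2}t(n^{-1}+t)^{-2}$ is bounded by $1/4n$, with equality at $t = 1/n$.

```python
def unit_profile(n, t):
    """t -> t / (1/n + t): applied to b_lambda this yields 1_lambda, cf. (C.71)."""
    return t / (1.0 / n + t)


def defect_profile(n, t):
    """t -> n^{-2} t (n^{-1} + t)^{-2}: the function appearing in (C.72)-(C.73)."""
    return t / (n * n * (1.0 / n + t) ** 2)


samples = [k / 1000.0 for k in range(0, 20001)]   # grid in [0, 20]


def sup_defect(n):
    """Grid approximation of the supremum of defect_profile(n, .) on R^+."""
    return max(defect_profile(n, t) for t in samples)


sups = {n: sup_defect(n) for n in (1, 3, 10)}
```

Setting the derivative of `defect_profile` to zero gives $t = 1/n$, where the value is $n^{-3}/(4n^{-2}) = 1/4n$, matching the bound used for (C.73).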

In this proof we used the following lemma. 

Lemma C.37. If $a, b \in A^+$ and $\|a + b\| \leq k$, then $\|a\| \leq k$. 

Proof. We first pass to the unitization $\dot{A}$ of $A$. By (C.83) we have $a + b \leq k1_{\dot{A}}$, hence $0 \leq a \leq k1_{\dot{A}} - b$ by linearity of $\leq$ (see Proposition C.51 below), which also implies that $k1_{\dot{A}} - b \leq k1_{\dot{A}}$, as $0 \leq b$. Hence, using $-k1_{\dot{A}} \leq 0$ (since $k \geq 0$), we obtain $-k1_{\dot{A}} \leq a \leq k1_{\dot{A}}$, from which $\|a\| \leq k$ by (C.84). □ 
C.6 C*-algebras without unit: commutative case 

We still owe the reader a proof of Theorems C.8 and C.23 for the nonunital case. 

In the commutative case, the unitization procedure has a simple topological 
meaning, which illustrates the general principle that the use of commutative C*- 
algebras often allows one to trade topological properties for algebraic ones. 

The one-point compactification $\dot{X}$ of a non-compact locally compact topological space $X$ is the set $\dot{X} = X \cup \{\infty\}$, topologized by the open sets in $X$ plus those subsets of $X \cup \{\infty\}$ whose complement is compact in $X$. The injection $\iota : X \hookrightarrow \dot{X}$ is continuous, and any continuous function $f \in C_0(X)$ extends uniquely to a function $\dot{f} \in C(\dot{X})$ satisfying $\dot{f}(\infty) = 0$. The space $\dot{X}$ is the solution (unique up to homeomorphism) of a universal problem: if $\varphi : X \to Y$ is a map between locally compact Hausdorff spaces such that $Y\setminus\varphi(X)$ is a point and $\varphi$ is a homeomorphism onto its image, then there is a unique homeomorphism $\psi : \dot{X} \to Y$ such that $\varphi = \psi\circ\iota$. All this is true even when $X$ is compact, in which case $\infty$ is an isolated point of $\dot{X}$. 

The unitization of $C_0(X)$ corresponds to the one-point compactification of $X$: 

Lemma C.38. Let $X$ be a locally compact Hausdorff space. Then $C_0(X)^{\cdot} \cong C(\dot{X})$. 

Proof. The map $\alpha : C_0(X)^{\cdot} \to C(\dot{X})$ given by $\alpha(f + \lambda\cdot 1) = \dot{f} + \lambda\cdot 1_{\dot{X}}$ is obviously an injective homomorphism. To prove surjectivity, note that any $g \in C(\dot{X})$ assumes the form $g = \dot{f} + g(\infty)\cdot 1_{\dot{X}}$, where $f$ is the restriction of $g - g(\infty)\cdot 1_{\dot{X}}$ to $X$, which lies in $C_0(X)$. Thus our map is an algebraic isomorphism, which by Theorem C.62 is also isometric. □ 

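Lemma C.39. Let $A$ be a commutative C*-algebra, with unitization $\dot{A}$. Then the following map $s_A : \Sigma(A)^{\cdot} \to \Sigma(\dot{A})$ between their Gelfand spectra (with $\Sigma(A)^{\cdot}$ the one-point compactification of $\Sigma(A)$) is a homeomorphism: 

1. Each $\omega \in \Sigma(A)$ extends to a character $\dot{\omega} = s_A(\omega)$ on $\dot{A}$ by 

$$\dot{\omega}(a + \lambda 1_{\dot{A}}) = \omega(a) + \lambda. \tag{C.75}$$

2. The following functional $\omega_\infty = s_A(\infty)$ on $\dot{A}$ is a character of $\dot{A}$: 

$$\omega_\infty(a + \lambda 1_{\dot{A}}) = \lambda. \tag{C.76}$$

3. There are no other characters on $\dot{A}$ (i.e. except $\omega_\infty$ and $\dot{\omega}$, where $\omega \in \Sigma(A)$). 

Proof. Only the third part is nontrivial: any $\omega' \in \Sigma(\dot{A})$ restricts to a multiplicative functional on $A$; if this restriction is zero, then $\omega' = \omega_\infty$, and if not, we have $\omega' = \dot{\omega}$ with $\omega = \omega'|_A$. □ 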

We are now in a position to prove Theorem C.8 also in the nonunital case. Applying the unital case of Theorem C.8 to $\dot{A}$ and using Lemma C.39, one finds 

$$A \oplus \mathbb{C} = \dot{A} \cong C(\Sigma(\dot{A})) \cong C(\Sigma(A)^{\cdot}) \cong C_0(\Sigma(A))^{\cdot} = C_0(\Sigma(A)) \oplus \mathbb{C}. \tag{C.77}$$

Keeping track of all isomorphisms, the initial $\mathbb{C}$ is duly mapped to the final $\mathbb{C}$ (as befits an isomorphism of unital C*-algebras), and $A$ is mapped to $C_0(\Sigma(A))$. □ 



Next, we return to Theorem C.23. If $X$ fails to be compact, the difficulty arises that a map $\varphi : X \to Y$ does not, in general, pull back to a morphism $\varphi^* : C_0(Y) \to C_0(X)$. For example, with $Y$ equal to a point, any $f \in C(Y) = \mathbb{C}$ pulls back to a constant function on $X$, which does not vanish at infinity. Hence some restriction is necessary on the class of allowed maps between locally compact Hausdorff spaces. 

Definition C.40. A map $\varphi : X \to Y$ between locally compact Hausdorff spaces is proper when $\varphi^{-1}(K)$ is compact for any compact set $K \subset Y$. 

Without proof (since this is basic topology), we list some properties of proper maps. 

Lemma C.41. Let $\varphi : X \to Y$ be a map between locally compact Hausdorff spaces. 

1. $\varphi$ is proper iff it is closed and $\varphi^{-1}(\mathrm{pt})$ is compact for any point $\mathrm{pt} \in Y$. 

2. If $X$ is compact and $\varphi$ is continuous, then $\varphi$ is proper. 

3. If $Y$ is compact and $X$ is not, proper maps $\varphi$ (trivially) do not exist. 

4. If $\varphi$ is continuous, then $\varphi$ is proper iff $\dot{\varphi} : \dot{X} \to \dot{Y}$, given by $\dot{\varphi}(x) = \varphi(x)$, $x \in X$, and $\dot{\varphi}(\infty_X) = \infty_Y$, is continuous (which is automatic if $X$ is compact, of course). 

5. The composition of two proper maps is again proper. 

The algebraic (or “noncommutative”) counterpart of a proper map is as follows. 

Definition C.42. A homomorphism $\alpha : A \to B$ between C*-algebras is called nondegenerate when $\overline{\alpha(A)B} = B$, in other words, if $\alpha(A)B$ (i.e., the linear span of all expressions of the form $\alpha(a)b$, $a \in A$, $b \in B$) is dense in $B$. 

For example, any unital homomorphism between unital C*-algebras is trivially nondegenerate, and conversely, a nondegenerate homomorphism $\alpha : A \to B$ between unital C*-algebras is automatically unital. To see this, it follows from (C.4) - (C.5) that $e = \alpha(1_A)$ is a projection in $B$ (i.e., $e^2 = e^* = e$), so that $\alpha(A)B \subset eB$. Since $B = eB \oplus (1_B - e)B$ as a vector space, $\alpha(A)B$ and hence $eB$ can only be dense in $B$ when $e = 1_B$. Similarly, using an approximate unit in $B$ it is easy to show that nondegenerate homomorphisms $A \to B$ cannot exist if $A$ is unital but $B$ is not. 

This is a “noncommutative” version of the third part of Lemma C.41 above. 

Lemma C.43. Let (p : X -g Y be a continuous proper map between locally compact 
Hausdorff spaces. If f G Co(T), then f o(p £ Co{X), and the corresponding pullback 
(p* :Co{Y) ^ Co{X) is a nondegenerate homomorphism of C*-algebras. 

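Proof. Let $f \in C_0(Y)$ and $\varepsilon > 0$, giving a compact $K \subset Y$ such that $|f(y)| < \varepsilon$ for
each $y \notin K$. Then $K' = \varphi^{-1}(K) \subset X$ is compact, and $|\varphi^* f(x)| < \varepsilon$ for each $x \notin K'$.

For nondegeneracy, take $g \in C_0(X)$ and $\varepsilon > 0$; these yield a compact set $L \subset X$
such that $|g(x)| < \varepsilon$ for each $x \notin L$. Then $\varphi(L) \subset Y$ is compact, so Urysohn gives us
$f \in C_c(Y)$ with $0 \le f(y) \le 1$ for each $y \in Y$ and $f(y) = 1$ for each $y \in \varphi(L)$. Then:

$$\|(\varphi^* f)\, g - g\|_\infty = \sup_{x \notin L} \{ |f(\varphi(x)) g(x) - g(x)| \} < 2\varepsilon. \qquad \Box$$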
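To see why properness cannot be dropped in Lemma C.43, the following small sketch may help. The choice $X = \mathbb{R}$, $Y = \{pt\}$ is an assumption made here purely for illustration and does not come from the text.

```latex
% Illustrative example (our choice of spaces): a continuous map that is not proper.
% Take X = \mathbb{R}, Y = \{pt\}, and let \varphi : X \to Y be the constant map.
\varphi^{-1}(\{pt\}) = \mathbb{R} \quad \text{is not compact, so } \varphi \text{ is not proper.}
% For the constant function f = 1 \in C_0(Y) = C(Y) \cong \mathbb{C}, the pullback is
(\varphi^* f)(x) = f(\varphi(x)) = 1 \qquad (x \in \mathbb{R}),
% which does not vanish at infinity, so that
\varphi^* f \notin C_0(\mathbb{R}).
```

This matches part 3 of Lemma C.41 (here $Y$ is compact and $X$ is not) as well as the remark after Definition C.42 that no nondegenerate homomorphism exists from a unital C*-algebra to a non-unital one.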

The (commutative) C*-algebraic counterpart of this lemma is as follows: 
Lemma C.44. Let $\alpha : A \to B$ be a nondegenerate homomorphism between commutative
C*-algebras. If $\omega \in \Sigma(B)$, then $\omega \circ \alpha \in \Sigma(A)$, and the ensuing pullback
$\alpha^* : \Sigma(B) \to \Sigma(A)$ is a continuous proper map between the two Gelfand spectra.

Proof. Multiplicativity of $\omega \circ \alpha$ is clear, as $\alpha$ is a homomorphism. If $\omega \circ \alpha$ were
identically zero, then $\omega(\alpha(a)b) = \omega(\alpha(a))\omega(b) = 0$ for all $a \in A$, $b \in B$, so that $\omega$
would vanish on the dense subspace $\alpha(A)B$ and hence on $B$, which contradicts
$\omega \neq 0$; this is where nondegeneracy of $\alpha$ is used. Continuity of $\alpha^*$ follows from the fact
that the Gelfand topology is the topology of pointwise convergence. Finally, in the
present context, properness of $\alpha^*$ is most appropriately derived as follows:

1. Use (C.66) to pass to a unital homomorphism $\dot\alpha : \dot A \to \dot B$.

2. Theorem C.23.2 gives a continuous map $(\dot\alpha)^* : \Sigma(\dot B) \to \Sigma(\dot A)$.

3. Lemma C.46 below and continuity of $s_B$ and $s_A^{-1}$ make $(\alpha^*)^{\cdot}$ continuous.

4. Lemma C.41.4 then proves that $\alpha^*$ is proper (and continuous). □

This suggests the following generalization of Theorem C.23: 

Theorem C.45. 1. If $X$ is a locally compact Hausdorff space, then $C_0(X)$ is a
commutative C*-algebra. A continuous proper map $\varphi : Y \to X$ induces a nondegenerate
homomorphism $C_0(\varphi) \equiv \varphi^* : C_0(X) \to C_0(Y)$, which behaves well
under composition (exactly as in Theorem C.23).

2. If $A$ is a commutative C*-algebra, then $\Sigma(A)$ is a locally compact Hausdorff
space. A nondegenerate homomorphism $\alpha : A \to B$ induces a continuous proper
map $\Sigma(\alpha) \equiv \alpha^* : \Sigma(B) \to \Sigma(A)$, which behaves well under composition, too.

3. There are canonical homeomorphisms and isomorphisms,

$$\mathrm{ev}_X : X \xrightarrow{\;\cong\;} \Sigma(C_0(X)); \tag{C.78}$$

$$G_A : A \xrightarrow{\;\cong\;} C_0(\Sigma(A)), \tag{C.79}$$

with similar naturalness properties as the corresponding maps in Theorem C.23.

Categorically speaking, Theorem C.23 thus expanded states that the category LCHp
of locally compact Hausdorff spaces and proper continuous maps is dual to the
category CCAn of commutative C*-algebras and nondegenerate homomorphisms.

Proof. Parts 1 and 2 are Lemmas C.43 and C.44, respectively; correct composition 
of the maps in question is easily checked (as simply as in the unital case). 

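Eq. (C.79) has already been proved, cf. (C.77). Similarly, using Proposition C.19
(with $X \rightsquigarrow \dot X$) and Lemma C.39 (with $A \rightsquigarrow C_0(X)$), we have

$$X \cup \{\infty\} \equiv \dot X \cong \Sigma(C(\dot X)) \cong \Sigma(\dot{C_0}(X)) \cong \dot\Sigma(C_0(X)) \equiv \Sigma(C_0(X)) \cup \{\omega_\infty\}. \tag{C.80}$$

Keeping track of the isomorphisms in question, it is easily verified that $X$ and $\infty$ are
mapped to $\Sigma(C_0(X))$ and $\omega_\infty$, respectively, and this proves (C.78).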

Naturality follows from the unital case (Theorem C.23) and the following lemma: 

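Lemma C.46. 1. Let $\alpha : A \to B$ be a nondegenerate homomorphism between commutative
C*-algebras. Then the following diagram commutes:

$$\begin{array}{ccc}
\dot\Sigma(B) & \xrightarrow{\;s_B\;} & \Sigma(\dot B) \\
{\scriptstyle (\alpha^*)^{\cdot}}\big\downarrow & & \big\downarrow{\scriptstyle (\dot\alpha)^*} \\
\dot\Sigma(A) & \xrightarrow{\;s_A\;} & \Sigma(\dot A),
\end{array}$$

where $s_A$ and $s_B$ are defined in Lemma C.39, $\dot\alpha$ is defined in (C.66), and $(\alpha^*)^{\cdot} = \dot\varphi$
for $\varphi = \alpha^* : \Sigma(B) \to \Sigma(A)$, where the dot is defined as in Lemma C.41.4.

2. Let $\varphi : X \to Y$ be a proper continuous map between locally compact Hausdorff
spaces. Then the following diagram commutes:

$$\begin{array}{ccc}
\dot{C_0}(Y) & \xrightarrow{\;c_Y\;} & C(\dot Y) \\
{\scriptstyle (\varphi^*)^{\cdot}}\big\downarrow & & \big\downarrow{\scriptstyle (\dot\varphi)^*} \\
\dot{C_0}(X) & \xrightarrow{\;c_X\;} & C(\dot X),
\end{array}$$

where $c_X$ and $c_Y$ are defined in the proof of Lemma C.38, $(\varphi^*)^{\cdot} = \dot\alpha$ for $\alpha = \varphi^* :
C_0(Y) \to C_0(X)$ defined by (C.66), and $\dot\varphi : \dot X \to \dot Y$ is defined in Lemma C.41.4.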

The proof is a diagram chase, but let us note that in clause 1 the role of nondegeneracy
is to ensure that $\alpha^*$ (and hence $(\alpha^*)^{\cdot}$) is defined in the first place (cf. Lemma
C.44). Similarly, in clause 2, the properness assumption on $\varphi$ ensures that $\varphi^*$ (and
hence $(\varphi^*)^{\cdot}$) is defined. Once defined, commutativity of these diagrams is obvious.

Finally, the property that LCHp is indeed a category is trivial (as the identity maps
$\mathrm{id} : X \to X$ are proper), but the corresponding fact for CCAn is not, for we need to
show that the identity arrows $\mathrm{id} : A \to A$ are nondegenerate. This comes down to the
property that $A^2 = A \cdot A$ is dense in $A$. In fact, the situation is even better:

Lemma C.47. In any C*-algebra $A$ one has $A^2 = A$ (and hence $A^n = A$, $n \in \mathbb{N}$).

Proof. We prove that any self-adjoint $a \in A$ takes the form

$$a = a_1 a_2, \tag{C.81}$$

for suitable $a_1, a_2 \in A$. Since the linear span of such $a$ is $A$, this proves the lemma.

We assume $A$ has no unit, for otherwise the claim is trivial. We then embed $A \subset \dot A$
and, for $a^* = a \in A$, consider $C^*(a) \subset \dot A$. We factor the identity function $t \mapsto t$ on
$\sigma(a) \subset \mathbb{R}$ as $t = f_1(t) f_2(t)$ for some $f_i \in C(\sigma(a))$, so that by Corollary C.26, we
have (C.81) for $a_i = f_i(a) \in C^*(a)$. By the properties of the map $f \mapsto f(a)$ mentioned
in Corollary C.26, including the fact that $1_{\sigma(a)} \mapsto 1_{\dot A}$, it follows that if $f(a) = b +
\mu \cdot 1_{\dot A}$ for some $b \in A$ and $\mu \in \mathbb{C}$, then $f(0) = \mu$; note that $0 \in \sigma(a)$ by Proposition
C.33. Consequently, imposing the additional condition $f_i(0) = 0$ enforces $a_i \in A$. □
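The proof only needs the existence of a suitable factorization. One concrete choice, given here as an illustration rather than as the author's own, is the following:

```latex
% One possible pair of factors (illustrative choice, not taken from the text):
f_1(t) = \operatorname{sgn}(t)\,|t|^{1/2}, \qquad f_2(t) = |t|^{1/2}
\qquad (t \in \sigma(a) \subset \mathbb{R}).
% Both functions are continuous on \sigma(a), and
f_1(t)\, f_2(t) = \operatorname{sgn}(t)\,|t| = t, \qquad f_1(0) = f_2(0) = 0.
% Hence a_1 = f_1(a) and a_2 = f_2(a) lie in A (not merely in \dot{A}) and a = a_1 a_2.
```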

Corollary C.48. Each nondegenerate homomorphism $\alpha : C_0(Y) \to C_0(X)$ is induced
by a proper continuous map $\varphi : X \to Y$ via $\alpha = \varphi^*$.

Proof. Given (C.78), the proof is the same as for the compact case, cf. Corollary
C.22. In particular, $\varphi$ is given by (C.37), which map is proper because $\alpha^*$ is proper
by Lemma C.44 and $\mathrm{ev}_X$ and $\mathrm{ev}_Y$ are homeomorphisms. □
C.7 Positivity in C*-algebras

We now turn to the important notion of positivity. First, we give two examples: 

• An operator $a \in B(H)$ on a Hilbert space $H$ is called positive when $\langle \psi, a\psi \rangle \ge 0$
for each $\psi \in H$. By Proposition B.99, this property is equivalent to $a^* = a$ and
$\sigma(a) \subset \mathbb{R}^+$, or to $a = b^*b$ for some $b \in B(H)$, or to $a = c^2$ for some $c = c^* \in B(H)$.

• A function $f$ on some space $X$ is called positive when $f(x) \ge 0$ for all $x \in X$. This
applies, in particular, to elements of the commutative C*-algebra $C_0(X)$.

These examples are not as dissimilar as they might appear at first sight: $a \in B(H)$
is positive iff its Gelfand transform $\mathrm{id}_{\sigma(a)} = \hat a$ is positive as a function in $C(\sigma(a))$;
cf. Theorem C.24. Hence we have a notion of positivity for certain concrete C*-algebras,
which we would like to generalize to arbitrary abstract C*-algebras.

Definition C.49. An element $a$ of a C*-algebra $A$ is called positive when $a = a^*$ and
its spectrum is positive; i.e., $\sigma(a) \subset \mathbb{R}^+$. We write $a \ge 0$ when $a$ is positive, and $A^+$
for the set of all positive elements in $A$.

The basic structure of A+ is captured by the following definition. 

Definition C.50. A convex cone in a real vector space $V$ is a subset $V^+ \subset V$ such
that:

1. If $v \in V^+$ and $t \in \mathbb{R}^+$, then $tv \in V^+$.

2. If $v, w \in V^+$, then $v + w \in V^+$.

3. $V^+ \cap -V^+ = \{0\}$.

A linear partial ordering in $V$ is a partial ordering $\le$ in which $v \le w$ implies
$tv \le tw$ for all $t \in \mathbb{R}^+$ as well as $v + u \le w + u$ for all $u \in V$.

These structures are equivalent: A convex cone $V^+ \subset V$ defines a linear partial ordering
$\le$ by $v \le w$ if $w - v \in V^+$, and conversely, $\le$ yields $V^+ = \{v \in V \mid 0 \le v\}$.

Proposition C.51. The set $A^+$ of all positive elements of a C*-algebra $A$ is a convex
cone in the real vector space $A_{sa}$, see (C.6).

Proof. Let $a \in A^+$. Property 1 follows from $\sigma(ta) = t\sigma(a)$, which is a special case
of (B.270). Since $\sigma(a) \subset [0, r(a)]$, we have $|c - \lambda| \le c$ for all $\lambda \in \sigma(a)$ and all
$c \ge r(a)$. Hence $\sup_{\lambda \in \sigma(a)} |c \cdot 1_{\sigma(a)}(\lambda) - \hat a(\lambda)| \le c$ by (C.22) and Theorem C.24, i.e.,
$\|c \cdot 1_{\sigma(a)} - \hat a\|_\infty \le c$. Gelfand transforming back to $C^*(a)$, by (C.18) this implies

$$\|c \cdot 1_A - a\| \le c, \tag{C.82}$$

for all $c \ge \|a\|$. Inverting this, one sees that if (C.82) holds for some $c \ge \|a\|$, then
$\sigma(a) \subset \mathbb{R}^+$. Use this with $a \rightsquigarrow a + b$ and $c = \|a\| + \|b\|$, so $c \ge \|a + b\|$. Then

$$\|c \cdot 1_A - (a + b)\| \le \|(\|a\| - a)\| + \|(\|b\| - b)\| \le c,$$

where in the last step we used the previous paragraph for $a \in A^+$ and $b \in A^+$ separately.
As in the case of a single element, this inequality implies $a + b \in A^+$. Finally, when $a \in A^+$ and
$a \in -A^+$ it must be that $\sigma(a) = \{0\}$, hence $a = 0$ by (B.257) and Definition A.1. □
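The norm criterion (C.82) can be checked by hand on diagonal matrices. The following sketch uses $A = M_2(\mathbb{C})$ with two matrices chosen here for illustration; they are not taken from the text.

```latex
% Illustrative check of (C.82) in A = M_2(\mathbb{C}) (our choice of examples).
% A positive element: a = diag(1,3), with \|a\| = 3; take c = 3.
\| 3 \cdot 1_A - \operatorname{diag}(1,3) \| = \| \operatorname{diag}(2,0) \| = 2 \le 3 .
% A self-adjoint element that is not positive: a' = diag(-1,3), again with \|a'\| = 3.
\| 3 \cdot 1_A - \operatorname{diag}(-1,3) \| = \| \operatorname{diag}(4,0) \| = 4 > 3 .
% So (C.82) holds for a and fails for a', in line with
% \sigma(a) = \{1,3\} \subset \mathbb{R}^+ and \sigma(a') = \{-1,3\} \not\subset \mathbb{R}^+.
```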




For example, when $a = a^*$ one checks the validity of the important inequalities

$$-\|a\| \cdot 1_A \le a \le \|a\| \cdot 1_A, \tag{C.83}$$

by taking the Gelfand transform of $C^*(a)$. This also yields the implication

$$-b \le a \le b \;\Longrightarrow\; \|a\| \le \|b\|, \tag{C.84}$$

because the antecedent and (C.83) with $a \rightsquigarrow b$ yield $-\|b\| \cdot 1_A \le a \le \|b\| \cdot 1_A$, so that
$\sigma(a) \subset [-\|b\|, \|b\|]$, hence $\|a\| \le \|b\|$ by (B.257) and (B.254).

We now come to the central result in the theory of positivity in C*-algebras,
which generalizes the cases $A = B(H)$ and $A = C_0(X)$ discussed at the beginning.

Theorem C.52. With $A^+ = \{a \in A \mid a \ge 0\}$ as in Definition C.49, one has

$$A^+ = \{a^2 \mid a^* = a\} \tag{C.85}$$

$$\phantom{A^+} = \{a^*a \mid a \in A\}. \tag{C.86}$$

Proof. If $\sigma(a) \subset \mathbb{R}^+$ and $a = a^*$, then $\sqrt{a} \in A$ is defined by Corollary C.26 for
$f(t) = \sqrt{t}$, and satisfies $(\sqrt{a})^2 = a$. Hence $A^+ \subseteq \{a^2 \mid a^* = a\}$. The opposite inclusion
follows from (C.53) and Corollary C.27. This proves (C.85).

Towards (C.86), the inclusion $A^+ \subseteq \{a^*a \mid a \in A\}$ is trivial from (C.85).

Lemma C.53. Each selfadjoint element $a$ has a unique decomposition

$$a = a_+ - a_-, \tag{C.87}$$

where $a_+, a_- \in A^+$ and $a_+ a_- = 0$. Moreover, $\|a_\pm\| \le \|a\| = \max\{\|a_+\|, \|a_-\|\}$.

Proof. Apply Corollary C.26 with $f = \mathrm{id}_{\sigma(a)} = f_+ - f_-$, where $\mathrm{id}_{\sigma(a)}(t) = t$ and
$f_\pm(t) = \max\{\pm t, 0\}$. The norm property follows from (C.52). Uniqueness follows
from the corresponding property in $C(\sigma(a))$, where it is obvious. □
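For a diagonal matrix the decomposition (C.87) can be read off entry by entry. The example below, in $A = M_2(\mathbb{C})$, is an illustration chosen here and is not taken from the text.

```latex
% Illustrative instance of Lemma C.53 in A = M_2(\mathbb{C}) (our choice of example).
a = \operatorname{diag}(2,-3), \qquad
a_+ = f_+(a) = \operatorname{diag}(2,0), \qquad
a_- = f_-(a) = \operatorname{diag}(0,3).
% Then a_\pm \ge 0 and
a = a_+ - a_-, \qquad a_+ a_- = 0, \qquad
\|a\| = 3 = \max\{\|a_+\|, \|a_-\|\} = \max\{2, 3\}.
```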

Apply the lemma to $a = b^*b$ (noting that $a$ is selfadjoint). Then

$$(a_-)^3 = -a_-(a_+ - a_-)a_- = -a_- b^*b\, a_- = -(b a_-)^* b a_-. \tag{C.88}$$

Since $\sigma(a_-) \subset \mathbb{R}^+$ because $a_-$ is positive, we see from (C.53) with $f(t) = t^3$ that
$(a_-)^3 \ge 0$. Hence $-(b a_-)^* b a_- \ge 0$.

Lemma C.54. If $-c^*c \in A^+$ for some $c \in A$ then $c = 0$.

Proof. We can write $c = d + ie$, with $d$ and $e$ selfadjoint, so that

$$c^*c = 2d^2 + 2e^2 - cc^*. \tag{C.89}$$

Now for any $a, b \in A$ one has

$$\sigma(ab) \cup \{0\} = \sigma(ba) \cup \{0\}. \tag{C.90}$$

This is because for $z \neq 0$, invertibility of $ab - z$ implies invertibility of $ba - z$; indeed,

$$(ba - z)^{-1} = z^{-1}\left(b(ab - z)^{-1}a - 1_A\right). \tag{C.91}$$

Applying (C.90) with $a \rightsquigarrow c$ and $b \rightsquigarrow c^*$, it follows that $\sigma(c^*c) \subset \mathbb{R}^-$ implies
$\sigma(cc^*) \subset \mathbb{R}^-$, hence $\sigma(-cc^*) \subset \mathbb{R}^+$. By (C.85) and Proposition C.51 (applied to
Definition C.50.2), eq. (C.89) then implies that $c^*c \ge 0$, i.e., $\sigma(c^*c) \subset \mathbb{R}^+$, so that
the assumption $-c^*c \in A^+$ now yields $\sigma(c^*c) = \{0\}$. Hence $c^*c = 0$ by Proposition C.51
applied to Definition C.50.3, so that $c = 0$ by (C.2). □
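Both algebraic identities used in this proof can be verified by direct expansion. The computation below is a routine check added for convenience; it is not part of the original argument.

```latex
% Check of (C.89): with c = d + ie and d, e selfadjoint, c^* = d - ie, so
c^*c = (d - ie)(d + ie) = d^2 + e^2 + i(de - ed), \qquad
cc^* = (d + ie)(d - ie) = d^2 + e^2 - i(de - ed),
% and adding the two gives c^*c + cc^* = 2d^2 + 2e^2.
%
% Check of (C.91): for z \neq 0 with ab - z invertible, use bab - zb = b(ab - z):
(ba - z)\, z^{-1}\bigl(b(ab - z)^{-1}a - 1_A\bigr)
  = z^{-1}\bigl(b(ab - z)(ab - z)^{-1}a - ba + z\bigr)
  = z^{-1}(ba - ba + z) = 1_A .
% Multiplication in the other order works the same way, using bab - zb = (ba - z)b.
```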

By this lemma, the last claim preceding it implies $b a_- = 0$. As

$$(a_-)^3 = -(b a_-)^* b a_- = 0, \tag{C.92}$$

we see that $(a_-)^3 = 0$, and finally $a_- = 0$ by Corollary C.26 with $f(t) = t^{1/3}$. Hence
$b^*b = a_+ \in A^+$. Thus $\{a^*a \mid a \in A\} \subseteq A^+$, which ends the proof of Theorem C.52. □

An important consequence of (C.86) is the fact that inequalities $a_1 \le a_2$ for
selfadjoint $a_1, a_2$ are stable under conjugation by arbitrary elements $b \in A$, so that
$a_1 \le a_2$ implies $b^* a_1 b \le b^* a_2 b$. This is because $a_1 \le a_2$ is the same as $a_2 - a_1 \ge 0$,
and hence by (C.86) there is an $a_3 \in A$ such that $a_2 - a_1 = a_3^* a_3$. But $(a_3 b)^* a_3 b \ge 0$,
i.e., $b^* a_1 b \le b^* a_2 b$. For example, replace $a$ in (C.83) by $a^*a$, and use (C.2), yielding
$a^*a \le \|a\|^2 1_A$. Applying the above principle gives the operator inequality

$$b^* a^* a b \le \|a\|^2 b^* b \qquad (a, b \in A). \tag{C.93}$$

We note that the definition of a state implies that if $a \le b$, then $\omega(a) \le \omega(b)$, so that

$$\omega(b^* a^* a b) \le \|a\|^2 \omega(b^* b), \tag{C.94}$$

from (C.93). This is a key lemma for the GNS-construction (cf. Theorem C.88).

At last, we are also in a position to prove the fundamental Lemma C.4. 

Proof. If $\omega$ is positive and $a^* = a$, then (C.83) in the form $\|a\| \cdot 1_A \pm a \ge 0$ gives
$|\omega(a)| \le \|a\|\,\omega(1_A)$, and hence $\omega(a) \in \mathbb{R}$. For general $a \in A$, eq. (C.8) then implies
$\omega(a^*) = \overline{\omega(a)}$ (which may alternatively be proved from Lemma C.53). This,
in turn, makes the form (C.24) hermitian. Cauchy-Schwarz then gives $|\omega(a)|^2 \le
\omega(a^*a)\,\omega(1_A)$, as in (C.25). Furthermore, if $\|a\| \le 1$ then also $\|a^*a\| \le 1$ by
(C.2), so that (C.83) gives $\omega(a^*a) \le \omega(1_A)$. Combining these inequalities yields
$|\omega(a)| \le \omega(1_A)$, so $\omega$ is bounded with $\|\omega\| \le \omega(1_A)$; taking $a = 1_A$ gives equality.

Conversely, assume that $\|\omega\| = \omega(1_A) = 1$. In proving that $\omega(a) \ge 0$ whenever
$a \ge 0$, we may also assume that $0 \le a \le 1_A$. Then (C.7) shows that $\alpha \equiv \omega(a) \in \mathbb{R}$.
Also, we have $\sigma(a) \subset [0, 1]$ and hence $\sigma(1_A - a) \subset [0, 1]$, which in turn implies
$0 \le (1_A - a) \le 1_A$, and hence $\|1_A - a\| \le 1$, cf. (C.84). Then

$$1 - \alpha \le |1 - \alpha| = |\omega(1_A - a)| \le \|\omega\|\,\|1_A - a\| \le 1, \tag{C.95}$$

whence $\alpha \ge 0$, and hence $\omega(a) \ge 0$. □
C.8 Ideals in Banach algebras

This section returns to general Banach algebras. It has two aims: it completes the 
(first) proof of Theorem C.8, and it prepares for the theory of ideals in C*-algebras. 

Definition C.55. Let $A$ be a Banach algebra.

• A left ideal (right ideal) in $A$ is a closed linear subspace $J$ for which $a \in J$
implies $ba \in J$ ($ab \in J$) for all $b \in A$.

• An ideal in $A$ is both a left and a right ideal (i.e., a closed two-sided ideal).

• A maximal ideal is a proper ideal $J \subset A$ (i.e., $J \neq \{0\}$ and $J \neq A$) that is not
properly contained in any larger proper ideal.

Thus an ideal is closed by definition. However, it is useful to know that if we omit
the word ‘closed’ throughout Definition C.55, a maximal ideal $J \subset A$ (defined in the
purely algebraic sense) is automatically closed. Indeed, note that the closure $\overline{J}$ of $J$
cannot be $A$, since $J$ does not contain any invertible element of $A$ (otherwise it would
coincide with $A$), and the set $A^*$ of all invertible elements in $A$ is open (see the proof
of Theorem B.84). Since $J \subseteq \overline{J}$ and $J$ is maximal, $\overline{J} = J$.

Furthermore, one often uses the fact that an ideal $J$ that contains an invertible
element $a$ must coincide with $A$ (since $a^{-1}a = 1_A$ must then lie in $J$, whence $J = A$).

In the commutative case, left and right ideals are the same as ideals. For example,
if $A = C(X)$ for a compact space $X$, then each closed subspace $Y \subset X$ defines an ideal

$$C(X;Y) = \{f \in C(X) \mid f(x) = 0 \;\forall x \in Y\}. \tag{C.96}$$

Note that $C(X;Y)$ is indeed closed by definition of the sup-norm, and that

$$C(X;Y) \cong C_0(X \setminus Y). \tag{C.97}$$

Proposition C.83 in §C.11 shows that all ideals in $C(X)$ are of this form. It is not
necessary to assume that $Y$ is closed, but this assumption entails no loss of generality,
since $C(X;Y) = C(X;\overline{Y})$, where $\overline{Y}$ is the closure of $Y$. We will see that $C(X;Y)$
is maximal iff $Y$ is a point, and that all maximal ideals in $C(X)$ are of this form.

The next proposition is predicated on an elementary Banach space result: 

Lemma C.56. If $V$ is a Banach space and $W$ is a closed linear subspace of $V$, then
the vector space quotient $V/W$ is a Banach space in the “distance to $W$” norm

$$\|\tau(v)\| = \inf_{w \in W} \|v - w\|, \tag{C.98}$$

where $\tau : V \to V/W$ is the canonical projection. Also, $\|\tau(v)\| \le \|v\|$ for any $v \in V$.

Proof. First, (C.98) is well defined, for if $\tau(v') = \tau(v)$, i.e., $v - v' = w' \in W$, then

$$\|\tau(v')\| = \inf\{\|v' - w\|, w \in W\} = \inf\{\|v - w - w'\|, w \in W\} = \inf\{\|v - w\|, w \in W\} = \|\tau(v)\|.$$

The axioms for a norm are easily verified, except positive definiteness: we have
$\|\tau(v)\| = 0$ iff $\inf\{\|v - w\|, w \in W\} = 0$; hence there must be a sequence $(w_n)$ in $W$
with $v - w_n \to 0$, or $w_n \to v$. Since $W$ is closed, $v \in W$, so that $\tau(v) = 0$. For the last
claim, eq. (C.98) yields $\|\tau(v)\| \le \|v - w\|$ for all $w \in W$; take $w = 0$.

There seems to be no natural proof of the completeness of $V/W$, but here is a
trick: for any Cauchy sequence $(\tau(v_n))_n$ in $V/W$, find a subsequence $(\tau(v_{n_k}))_k$ with
$\|\tau(v_{n_{k+1}}) - \tau(v_{n_k})\| < 2^{-k}$ for all $k$. Using induction in $k$, one finds a sequence $(u_k)$ in
$V$ with $\tau(u_k) = \tau(v_{n_k})$ and $\|u_{k+1} - u_k\| < 2^{-k}$. Hence $u_k \to u$ (since $V$ is complete),
and hence $\tau(v_{n_k}) \to \tau(u)$ by continuity of $\tau$. Since a Cauchy sequence with a convergent
subsequence converges to the same limit, then also $(\tau(v_n))_n \to \tau(u)$. □
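A finite-dimensional example makes the “distance to $W$” norm (C.98) concrete. The spaces below are chosen here for illustration and do not appear in the text.

```latex
% Illustrative example of (C.98) (our choice): V = \mathbb{C}^2 with the Euclidean norm,
% W = \{(w,0) \mid w \in \mathbb{C}\}. For v = (x,y) one finds
\|\tau(x,y)\| = \inf_{w \in \mathbb{C}} \|(x - w,\, y)\|
              = \inf_{w \in \mathbb{C}} \sqrt{|x - w|^2 + |y|^2} = |y| ,
% with the infimum attained at w = x. Hence \tau(x,y) \mapsto y identifies
% V/W isometrically with \mathbb{C}, and \|\tau(v)\| = |y| \le \|v\| as claimed in Lemma C.56.
```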

Proposition C.57. If $J$ is an ideal in a Banach algebra $A$, then the quotient $A/J$ is a
Banach algebra with multiplication

$$\tau(a)\tau(b) = \tau(ab). \tag{C.99}$$

If $A$ is unital and $J$ is proper, $A/J$ is unital, with unit $\tau(1_A)$ satisfying

$$\|\tau(1_A)\| = 1. \tag{C.100}$$

Proof. As far as the Banach algebra structure is concerned, first note that (C.99) is
well defined: when $j_1, j_2 \in J$ one has

$$\tau(a + j_1)\tau(b + j_2) = \tau(ab + a j_2 + j_1 b + j_1 j_2) = \tau(ab) = \tau(a)\tau(b), \tag{C.101}$$

since $a j_2 + j_1 b + j_1 j_2 \in J$ by definition of an ideal, and $\tau(j) = 0$ for all $j \in J$.

To prove (C.1), observe that, by definition of the infimum, for given $a \in A$, for
each $\varepsilon > 0$ there exists a $j \in J$ such that

$$\|\tau(a)\| + \varepsilon > \|a + j\|. \tag{C.102}$$

For if such a $j$ did not exist, then $\|\tau(a)\| \le \|a + j\| - \varepsilon$ for all $j \in J$, violating
(C.98). On the other hand, for any $j \in J$, it is clear from (C.98) that

$$\|\tau(a)\| = \|\tau(a + j)\| \le \|a + j\|. \tag{C.103}$$

For $a, b \in A$, choose $\varepsilon > 0$ and $j_1, j_2 \in J$ such that (C.102) holds for $a, b$, and estimate

$$\|\tau(a)\tau(b)\| = \|\tau(a + j_1)\tau(b + j_2)\| = \|\tau((a + j_1)(b + j_2))\|$$
$$\le \|(a + j_1)(b + j_2)\| \le \|a + j_1\|\,\|b + j_2\|$$
$$\le (\|\tau(a)\| + \varepsilon)(\|\tau(b)\| + \varepsilon). \tag{C.104}$$

Letting $\varepsilon \to 0$ yields

$$\|\tau(a)\tau(b)\| \le \|\tau(a)\|\,\|\tau(b)\|. \tag{C.105}$$

If $A$ has a unit, $\tau(1_A)$ is a unit in $A/J$, cf. (C.99). By (C.103) with $a = 1_A$ and
$j = 0$ one has $\|\tau(1_A)\| \le \|1_A\| = 1$. On the other hand, from (C.105) and (C.99)
with $b = 1_A$ and $a \in A \setminus J$, one derives $\|\tau(1_A)\| \ge 1$. Hence (C.100) follows. □
In a C*-algebra the last step is unnecessary, since a unit necessarily has norm one.
In the commutative case, a nice example (with $X$ and $Y$ compact, as above), is

$$C(X)/C(X;Y) \cong C(Y), \tag{C.106}$$

as two elements $f, g$ of $C(X)$ are identified in $C(X)/C(X;Y)$ when $f - g \in C(X;Y)$,
i.e., when they coincide on $Y$. If one looks at $C(X;Y)$ as the kernel of the restriction
map $r_Y : C(X) \to C(Y)$, then $\mathrm{ran}(r_Y) \cong C(X)/\ker(r_Y)$, which is just (C.106).

We now prove Proposition C.13, which we unfold as: 

1. If $\omega \in \Sigma(A)$, then $J_\omega = \ker(\omega)$ is a maximal ideal in $A$;

2. $\omega_1 = \omega_2$ iff $J_{\omega_1} = J_{\omega_2}$;

3. Every maximal ideal $J$ is of the form $J = J_\omega$ for some $\omega \in \Sigma(A)$.

For the first claim, $J_\omega$ is an ideal since $\omega$ is multiplicative. To prove maximality,
suppose $J_\omega \subseteq I \subseteq A$ for some ideal $I$. Then $\omega(I)$ is an ideal in $\mathbb{C}$, so either $\omega(I) = \{0\}$
or $\omega(I) = \mathbb{C}$. In the former case, $I = J_\omega$ (since $I \subseteq \ker\omega = J_\omega$), in the latter, $I = A$
(because for any $a \in A$ there is $b \in I$ such that $\omega(a) = \omega(b)$, whence $a - b \in \ker\omega$
and hence $a - b \in I$, or $a \in b + I = I$). Thus $J_\omega$ is maximal.

For the second, if $\omega_1(a) = c$, then $\omega_1(a - c \cdot 1_A) = 0$ by (C.14), so if $\ker(\omega_1) =
\ker(\omega_2)$, then also $\omega_2(a - c \cdot 1_A) = 0$ and hence $\omega_2(a) = c = \omega_1(a)$.

Finally, let $J$ be maximal. Since $J \neq A$, there is a nonzero $b \in A$, $b \notin J$. Form

$$J_b = \{ba + j \mid a \in A, j \in J\}. \tag{C.107}$$

Since $A$ is commutative, $J_b$ is an ideal. Taking $a = 0$ gives $J \subseteq J_b$. Taking $a = 1_A$
and $j = 0$ gives $b \in J_b$, so that $J_b \neq J$. Hence $J_b = A$, as $J$ is maximal. In particular,
$1_A \in J_b$, so that $1_A = ba + j$ for some $a \in A$, $j \in J$. Applying $\tau : A \to A/J$ gives

$$\tau(1_A) = 1_{A/J} = \tau(ba) = \tau(b)\tau(a), \tag{C.108}$$

because of (C.99) and $\tau(j) = 0$. Hence $\tau(a) = \tau(b)^{-1}$ in $A/J$. Since $b \notin J$ (i.e., $\tau(b) \neq 0$) was
arbitrary, this shows that every nonzero element of $A/J$ is invertible. At this point it
is therefore appropriate to invoke the Gelfand-Mazur Theorem:

Theorem C.58. If every nonzero element of a unital commutative Banach algebra
$B$ is invertible (i.e., if $B$ is simple), then $B \cong \mathbb{C}$ as Banach algebras.

Proof. Since $\sigma(b) \neq \emptyset$, for each $b \in B$ there is $\lambda \in \mathbb{C}$ for which $b - \lambda \cdot 1_B$ is not
invertible. Hence $b - \lambda \cdot 1_B = 0$ by assumption, and $b \mapsto \lambda$ is an isomorphism. □

Hence there is an isomorphism $\psi : A/J \to \mathbb{C}$, from which we define $\omega : A \to \mathbb{C}$
by $\omega(a) = \psi(\tau(a))$. This map is clearly linear (since $\tau$ and $\psi$ are), and nonzero
(because $\omega(1_A) = 1$). Also, $\omega(a)\omega(b) = \omega(ab)$ by (C.99) and the fact that $\psi$ is a
homomorphism, so $\omega \in \Sigma(A)$. Finally, since $\ker(\tau) = J$ and $\psi$ is an isomorphism,
$J = \ker(\omega)$. This proves claim 3 above, and therefore Proposition C.13 also follows.
Definition C.55 verbatim applies to C*-algebras. One would expect that an ideal in
a C*-algebra is required to be selfadjoint by definition, but this is unnecessary:

Proposition C.59. Let $J$ be an ideal in a C*-algebra $A$. If $a \in J$ then $a^* \in J$; in other
words, every ideal in a C*-algebra is automatically selfadjoint.

The proof (which generalizes a similar argument for compact operators, given at the 
end of §B.18) relies on the theory of approximate units (see §C.5). 

Proof. Let $J \subset A$ be the given ideal, and put $J^* = \{a^* \mid a \in J\}$. Note that $j \in J$ implies
$j^*j \in J \cap J^*$: it lies in $J$ because $J$ is an ideal, hence a left-ideal, and it lies in $J^*$ because
$J^*$ is an ideal, hence a right-ideal. By the same argument applied to $j^* \in J^*$, also $jj^* \in J \cap J^*$.
Since $J$ is an ideal, $J \cap J^*$ is a C*-subalgebra
of $A$. Hence by C.36 it has an approximate unit $(1_\lambda)$. Take $j \in J$. Using (C.2),

$$\|j^* - j^* 1_\lambda\|^2 = \|(j - 1_\lambda j)(j^* - j^* 1_\lambda)\|$$
$$\le \|(jj^* - jj^* 1_\lambda)\| + \|1_\lambda\|\,\|(jj^* - jj^* 1_\lambda)\|, \tag{C.109}$$

since $1_\lambda^* = 1_\lambda$. As we have seen, $jj^* \in J \cap J^*$, so that, also using (C.69), both terms
vanish for $\lambda \to \infty$. Hence $\lim_{\lambda \to \infty} \|j^* - j^* 1_\lambda\| = 0$. But $1_\lambda$ lies in $J \cap J^*$, so certainly
$1_\lambda \in J$, and since $J$ is an ideal it must be that $j^* 1_\lambda \in J$ for all $\lambda$. Hence $j^*$ is a norm-limit
of elements in $J$; since $J$ is closed, it follows that $j^* \in J$. □

We now turn to a C*-algebraic analogue of Proposition C.57, which is of sufficient
importance to promote it to the status of a theorem:

Theorem C.60. Let $J$ be an ideal in a C*-algebra $A$. Then $A/J$ is a C*-algebra with
respect to the norm (C.98), the multiplication (C.99), and the involution

$$\tau(a)^* = \tau(a^*). \tag{C.110}$$

The proof of this theorem uses approximate units, too. In view of Proposition C.57,
all we need to prove to establish Theorem C.60 is the property (C.2). This uses:

Lemma C.61. Let $(1_\lambda)$ be an approximate unit for $J$, and let $a \in A$. Then

$$\|\tau(a)\| = \lim_{\lambda \to \infty} \|a - a 1_\lambda\|. \tag{C.111}$$

Proof. It is obvious from (C.98) that

$$\|a - a 1_\lambda\| \ge \|\tau(a)\|. \tag{C.112}$$

For the opposite inequality, add a unit $1_A$ to $A$ if necessary, pick any $j \in J$, and write

$$\|a - a 1_\lambda\| = \|(a + j)(1_A - 1_\lambda) + j(1_\lambda - 1_A)\| \le \|a + j\|\,\|1_A - 1_\lambda\| + \|j 1_\lambda - j\|. \tag{C.113}$$

Note that

$$\|1_A - 1_\lambda\| \le 1, \tag{C.114}$$

by Definition C.35 and the proof of Proposition C.51. The second term on the right-hand
side goes to zero for $\lambda \to \infty$, since $j \in J$. Hence

$$\lim_{\lambda \to \infty} \|a - a 1_\lambda\| \le \|a + j\|. \tag{C.115}$$

For each $\varepsilon > 0$ we can choose $j \in J$ so that (C.102) holds. For this specific $j$, we
combine (C.112), (C.115), and (C.102) to find

$$\lim_{\lambda \to \infty} \|a - a 1_\lambda\| - \varepsilon \le \|\tau(a)\| \le \|a - a 1_\lambda\|. \tag{C.116}$$

Letting $\varepsilon \to 0$ proves (C.111). □

We now prove (C.2) in $A/J$. Successively using (C.111), (C.2) in $A$, (C.114),
(C.111), (C.99), and (C.110), we find

$$\|\tau(a)\|^2 = \lim_{\lambda \to \infty} \|a - a 1_\lambda\|^2 = \lim_{\lambda \to \infty} \|(a - a 1_\lambda)^*(a - a 1_\lambda)\|$$
$$= \lim_{\lambda \to \infty} \|(1_A - 1_\lambda)\, a^*a\, (1_A - 1_\lambda)\| \le \lim_{\lambda \to \infty} \|1_A - 1_\lambda\|\,\|a^*a(1_A - 1_\lambda)\|$$
$$\le \lim_{\lambda \to \infty} \|a^*a(1_A - 1_\lambda)\| = \|\tau(a^*a)\| = \|\tau(a^*)\tau(a)\|$$
$$= \|\tau(a)^*\tau(a)\|. \tag{C.117}$$

As in the proof of Proposition C.30, this implies (C.2), and hence Theorem C.60. □

We now state and prove the key result about morphisms.

Theorem C.62. Let $\alpha : A \to B$ be a nonzero homomorphism between C*-algebras.

1. The homomorphism $\alpha$ is continuous, with norm $\|\alpha\| = 1$.

2. Its kernel $\ker(\alpha)$ is an ideal in $A$.

3. If $\alpha$ is injective, then it is isometric.

4. An isomorphism of C*-algebras is automatically isometric.

5. The range $\alpha(A)$ is a C*-subalgebra of $B$; in particular, $\alpha(A)$ is closed in $B$.

Proof. If necessary, we first reduce the proof of the first claim to the case where
$A$ and $B$ have units and $\alpha$ is unital: we do so by replacing $A$ and $B$ by $\dot A$ and $\dot B$,
respectively (even if $A$ and/or $B$ was already unital in the first place, but $\alpha$ was not),
and replacing $\alpha$ by the homomorphism $\dot\alpha : \dot A \to \dot B$ defined in (C.66). If we do so,
it follows from Lemma C.34 that in the worst case the spectrum of $a$ or $\alpha(a)$ is
modified by adding 0, which does not change the spectral radius. Therefore, the
move from $\alpha$ to $\dot\alpha$ makes no difference to the argument to follow, so we assume
that $1_A \in A$ and $1_B \in B$, and $\alpha(1_A) = 1_B$. If $z \in \rho(a)$, so that $(a - z)^{-1}$ exists in
$A$, then $\alpha(a - z)$ is certainly invertible in $B$, for (C.4) implies that $(\alpha(a - z))^{-1} =
\alpha((a - z)^{-1})$. Hence $\rho(a) \subseteq \rho(\alpha(a))$, so that

$$\sigma(\alpha(a)) \subseteq \sigma(a). \tag{C.118}$$

Replacing $a$ by $a^*a$ this gives $r(\alpha(a^*a)) \le r(a^*a)$, and since $\alpha(a^*a) = \alpha(a)^*\alpha(a)$,
eq. (C.55) yields $\|\alpha(a)\| \le \|a\|$, and hence $\|\alpha\| \le 1$. This proves continuity of $\alpha$.
Recalling that ideals in C*-algebras have to be closed by definition, this also
implies the second claim of the theorem (whose algebraic content is trivial).

We now prove the third claim of the theorem (which trivially implies the fourth).
Assume there is $b \in A$ for which $\|\alpha(b)\| \neq \|b\|$, so that $\sigma(a) \neq \sigma(\alpha(a))$ for $a =
b^*b$, by (C.55). Then (C.118) implies the strict inclusion $\sigma(\alpha(a)) \subset \sigma(a)$ (as a closed
subset). By Urysohn’s lemma, there is a nonzero function $f \in C(\sigma(a))$ that vanishes
on $\sigma(\alpha(a))$, so that $f(\alpha(a)) = 0$ by Corollary C.26. By Lemma C.63 below, this
implies $\alpha(f(a)) = 0$. If $\alpha$ is injective, this contradicts the property $f(a) \neq 0$, which
follows from $f \neq 0$ and (C.52). Thus $\alpha$ must be isometric.

Combining the second claim with Theorem C.60, we see that $A/\ker(\alpha)$ is a C*-algebra.
By the theory of vector spaces, we have a vector space isomorphism

$$\psi : A/\ker(\alpha) \to \alpha(A), \tag{C.119}$$

so that

$$\psi \circ \tau = \alpha. \tag{C.120}$$

Since $\alpha$ and $\tau$ are homomorphisms between C*-algebras, so is $\psi$. Since $\psi$ is injective,
it is isometric, as we have just shown. Hence $\psi$ has closed range
in $B$. But $\psi(A/\ker(\alpha)) = \alpha(A)$, so that $\alpha$ has closed range in $B$. Since $\alpha$ is a morphism,
its image is a *-algebra in $B$, which by the preceding sentence is closed in
the norm of $B$. Hence $\alpha(A)$, inheriting all operations in $B$, is a C*-algebra.

Finally, we prove that for the projection $\tau : A \to A/J$ in the case at hand we have

$$\|\tau\| = 1. \tag{C.121}$$

If $A$ has a unit, this follows from Lemma C.56 with (C.100). If not, the argument
is similar, using an approximate identity $(1_\lambda)$ for $A$: from (C.105) we obtain
$\lim_\lambda \|\tau(1_\lambda)\| \ge 1$, which with (C.69) gives $\sup_\lambda \|\tau(1_\lambda)\| = 1$. Since $\|\tau\| \le 1$ from
Lemma C.56, this yields (C.121).

Because $\psi$ is an isometry, it then follows from (C.120) that $\|\alpha\| = 1$. □

Here we used a nice property of the continuous functional calculus (Theorem C.25):

Lemma C.63. If $\alpha : A \to B$ is a morphism, and $a = a^*$, then

$$f(\alpha(a)) = \alpha(f(a)) \qquad (f \in C(\sigma(a))). \tag{C.122}$$

Here $f(a)$ and $f(\alpha(a))$ are defined through Theorem C.25, cf. (C.118).

Proof. The property is true for polynomials by (C.4), since for those functions, $f(a)$
and $f(\alpha(a))$ have their naive meaning. The general claim follows by continuity. □
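The two steps of this proof can be spelled out as follows. This is a sketch under the assumption, made here for simplicity, that $A$, $B$ and $\alpha$ are unital (as arranged at the start of the proof of Theorem C.62); the approximating polynomials $p_n$ are notation introduced here.

```latex
% Step 1 (polynomials): for p(t) = \sum_{k=0}^{n} c_k t^k, linearity and
% multiplicativity of \alpha, together with \alpha(1_A) = 1_B, give
\alpha(p(a)) = \alpha\Bigl(\sum_{k=0}^{n} c_k a^k\Bigr)
             = \sum_{k=0}^{n} c_k\, \alpha(a)^k = p(\alpha(a)).
% Step 2 (continuity): choose polynomials p_n \to f uniformly on \sigma(a)
% (Weierstrass). Since \sigma(\alpha(a)) \subseteq \sigma(a) by (C.118) and \|\alpha\| \le 1,
\|\alpha(f(a)) - \alpha(p_n(a))\| \le \|f - p_n\|_\infty \to 0, \qquad
\|f(\alpha(a)) - p_n(\alpha(a))\| \le \|f - p_n\|_\infty \to 0,
% and combining this with Step 1 yields f(\alpha(a)) = \alpha(f(a)).
```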

Corollary C.64. Every ideal in a C*-algebra is the kernel of some homomorphism.

Proof. This follows from Proposition C.59, since $J$ is the kernel of $\tau : A \to A/J$,
where $A/J$ is a C*-algebra and $\tau$ is a morphism by (C.99) and (C.110). □
C.10 Hilbert C*-modules and multiplier algebras

In §C.5 we explained the minimal way of adding a unit to a C*-algebra that did not
have one to begin with (although the procedure even works if it does). There is also
a maximal way, which embeds a non-unital C*-algebra in its multiplier algebra. In
our view, this maximal extension is actually more elegant and useful than the minimal
one, although the commutative case might give the opposite impression; here
(as we have seen), the minimal extension corresponds to the simple one-point compactification
of the Gelfand spectrum, whereas the maximal one extends the latter to
its awesome Čech-Stone compactification. In topology one may doubt if the latter is
indeed the neater choice, but for many noncommutative C*-algebras the multiplier
algebra comes naturally. For example, the C*-algebra $B_0(H)$ of compact operators
on a Hilbert space $H$ is thereby turned into the C*-algebra $B(H)$ of bounded ones.

There are various ways of defining multiplier algebras. Although not strictly necessary,
we offer the powerful entrance provided by Hilbert C*-modules, which are
simultaneous generalizations of C*-algebras, Hilbert spaces, and vector bundles.

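Definition C.65. A pre-Hilbert C*-module over a C*-algebra $A$ consists of:

• A right $A$-module $E$, i.e., a complex linear space equipped with a bilinear map
$E \times A \to E$, written $(\psi, a) \mapsto \psi a$ (where $\psi \in E$ and $a \in A$) such that

$$(\psi b)a = \psi(ba). \tag{C.123}$$

• A map $\langle \cdot, \cdot \rangle_A : E \times E \to A$, linear in the second entry (the axioms below implying
antilinearity in the first entry) that for all $\psi, \varphi \in E$ and $a \in A$, satisfies

$$\langle \psi, \varphi \rangle_A^* = \langle \varphi, \psi \rangle_A; \tag{C.124}$$

$$\langle \psi, \varphi a \rangle_A = \langle \psi, \varphi \rangle_A\, a; \tag{C.125}$$

$$\langle \psi, \psi \rangle_A \ge 0; \tag{C.126}$$

$$\langle \psi, \psi \rangle_A = 0 \;\Leftrightarrow\; \psi = 0. \tag{C.127}$$

It is useful to note that (C.124) and (C.125) imply that

$$\langle \psi a, \varphi \rangle_A = a^* \langle \psi, \varphi \rangle_A. \tag{C.128}$$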
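The step from (C.124) and (C.125) to (C.128) is a one-line computation, written out here for convenience:

```latex
% Derivation of (C.128) from (C.124) and (C.125):
\langle \psi a, \varphi \rangle_A
  = \langle \varphi, \psi a \rangle_A^{*}             % by (C.124)
  = \bigl( \langle \varphi, \psi \rangle_A\, a \bigr)^{*}  % by (C.125)
  = a^{*} \langle \varphi, \psi \rangle_A^{*}
  = a^{*} \langle \psi, \varphi \rangle_A .             % by (C.124) again
```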

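Lemma C.66. In a pre-Hilbert C*-module $E$ over a C*-algebra $A$ one has:

$$\langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A \le \|\varphi\|^2 \langle \psi, \psi \rangle_A; \tag{C.129}$$

$$\|\langle \psi, \varphi \rangle_A\| \le \|\psi\|\,\|\varphi\|; \tag{C.130}$$

$$\|\psi a\| \le \|\psi\|\,\|a\|, \tag{C.131}$$

in which the following expression (which duly defines a norm on $E$) occurs:

$$\|\psi\| = \|\langle \psi, \psi \rangle_A\|^{1/2}. \tag{C.132}$$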
Proof. To prove (C.129), we assume $\varphi \neq 0$ (otherwise, the claim clearly holds), so
that also $\|\varphi\| > 0$ by (C.127) and (C.132). Replacing $\varphi$ by $\varphi/\|\varphi\|$ if necessary (i.e.,
if $\|\varphi\| \neq 1$), it is then enough to show that whenever $\|\varphi\| = 1$, we have

$$\langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A \le \langle \psi, \psi \rangle_A. \tag{C.133}$$

To this effect, we substitute $\varphi \langle \varphi, \psi \rangle_A - \psi$ for $\psi$ in (C.126) and use (C.128), (C.124),
and (C.125), and (C.93), the latter in the form $b^* c b \le \|c\| b^* b$ for any $b$ and $c \ge 0$ in $A$.
This gives (C.129). Eqs. (C.2), (C.124), and (C.129) then imply (C.130). Eq. (C.131)
follows from (C.128), (C.93), (C.84), and (C.2).

Finally, (C.132) defines a norm: scaling is clear, positive definiteness follows
from (C.127), and the triangle inequality is easily derived from (C.130). □
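The substitution described in this proof can be carried out explicitly. In the sketch below, the abbreviations $b$ and $c$ are introduced here for readability; they match the roles of $b$ and $c$ in the quoted form of (C.93).

```latex
% Abbreviate b = \langle \varphi, \psi \rangle_A and c = \langle \varphi, \varphi \rangle_A \ge 0,
% with \|c\| = \|\varphi\|^2 = 1; note b^* = \langle \psi, \varphi \rangle_A by (C.124).
% Expanding (C.126) for \varphi b - \psi with (C.125) and (C.128):
0 \le \langle \varphi b - \psi,\, \varphi b - \psi \rangle_A
  = b^{*} c\, b - b^{*} b - b^{*} b + \langle \psi, \psi \rangle_A .
% Using b^* c b \le \|c\|\, b^* b = b^* b from (C.93):
0 \le b^{*} b - 2\, b^{*} b + \langle \psi, \psi \rangle_A
\quad\Longrightarrow\quad
\langle \psi, \varphi \rangle_A \langle \varphi, \psi \rangle_A = b^{*} b \le \langle \psi, \psi \rangle_A ,
% which is (C.133).
```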

Corollary C.67. The inner product on a pre-Hilbert C*-module is nondegenerate,
in that $\psi = 0$ iff $\langle \psi, \varphi \rangle_A = 0$ for all $\varphi \in E$.

Proof. It follows from (C.129) that for any $\psi \in E$, we have

$$\|\psi\| = \sup\{\|\langle \psi, \varphi \rangle_A\|, \varphi \in E, \|\varphi\| = 1\}. \tag{C.134}$$

□

We now come to the main definition.

Definition C.68. A Hilbert C*-module over A is a pre-Hilbert C*-module over A 
that is complete in the norm (C.132). We also say that E is a Hilbert A-module. 

The three most straightforward examples of this concept, written “$E \rightleftharpoons A$”, are:

• C*-algebras themselves: $E = A$ with action $(a, b) \mapsto ab$ and inner product

$$\langle a, b \rangle_A = a^* b. \tag{C.135}$$

By (C.2), the norm in $E$ defined by (C.132) coincides with the original norm.

• Hilbert spaces: $E = H$ and $A = \mathbb{C}$, acting on $H$ by the given scalar multiplication.

• Hermitian vector bundles $S$ over locally compact Hausdorff spaces $X$: here $E =
C_0(X, S)$ consists of the continuous cross-sections $\psi$ of $S$ vanishing at infinity,
$A = C_0(X)$ has natural action on $E$ given by $(\psi a)(x) = a(x)\psi(x)$, and the $C_0(X)$-valued
inner product is given by the hermitian structure $\langle \cdot, \cdot \rangle_x$ on each fiber,

$$\langle \psi, \varphi \rangle_{C_0(X)} : x \mapsto \langle \psi(x), \varphi(x) \rangle_x. \tag{C.136}$$

This implies a norm $\|\psi\| = \sup\{\|\psi(x)\|_x, x \in X\}$, where $\|v\|_x^2 = \langle v, v \rangle_x$.

A Hilbert C*-module $E \rightleftharpoons A$ defines a C*-algebra $C^*(E, A)$ that consists of all
maps $a : E \to E$ for which there exists a map $a^* : E \to E$ such that for all $\psi, \varphi \in E$,

$$\langle \psi, a\varphi \rangle_A = \langle a^*\psi, \varphi \rangle_A. \tag{C.137}$$

Such maps are called adjointable. For example, if $E = A$, as in the first example
above, then any element $a \in A$ defines an adjointable map simply by left multiplication
(i.e., $a(b) = ab$). If $A$ has a unit, then these are all the adjointable maps, whereas in the
nonunital case there are (many) more adjointable maps on $A \rightleftharpoons A$.
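That left multiplication is adjointable on $E = A$ follows directly from (C.135) and (C.137). In the check below, the symbol $L_a$ for the left-multiplication map is notation introduced here.

```latex
% Let L_a : A \to A, L_a(c) = ac (notation introduced for this check).
% With \langle b, c \rangle_A = b^* c from (C.135):
\langle b, L_a c \rangle_A = b^{*}(ac) = (a^{*}b)^{*} c = \langle L_{a^*} b, c \rangle_A
\qquad (b, c \in A),
% so (C.137) holds with (L_a)^* = L_{a^*}.
```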
We now show that adjointable maps on a Hilbert C*-module form a C*-algebra.

Theorem C.69. 1. An adjointable map on a Hilbert $A$-module is automatically $\mathbb{C}$-linear,
$A$-linear (that is, $(a\psi)b = a(\psi b)$ for all $\psi \in E$ and $b \in A$), and bounded.

2. The adjoint of an adjointable map is unique, and the map $a \mapsto a^*$ defines an
involution on the space $C^*(E, A)$ of all adjointable maps on $E$.

3. Equipped with this involution, and with the usual operator norm on the Banach
space $E$, the space $C^*(E, A)$ is a C*-algebra.

4. For each $a \in C^*(E, A)$ and $\psi \in E$, the usual bound $\|a\psi\| \le \|a\|\,\|\psi\|$ sharpens to

$$\langle a\psi, a\psi \rangle_A \le \|a\|^2 \langle \psi, \psi \rangle_A. \tag{C.138}$$

Proof. The property of $\mathbb{C}$-linearity is obvious, whereas $A$-linearity follows from
(C.128): this gives $\langle a(\psi b),\varphi\rangle_A = \langle a(\psi)b,\varphi\rangle_A$, upon which Corollary C.67 yields
the claim. A similar argument shows that $a^* \in C^*(E,A)$ when $a \in C^*(E,A)$.

To prove boundedness, fix $\psi \in E$ and $a \in C^*(E,A)$, and define $T_\psi : E \to A$ by
$T_\psi\varphi = \langle a^*a\psi,\varphi\rangle_A$. It is clear from (C.130) that $\|T_\psi\| \leq \|a^*a\psi\|$, so that $T_\psi$ is
bounded. On the other hand, since $a$ is adjointable, one has $T_\psi\varphi = \langle\psi, a^*a\varphi\rangle_A$, so
that, using (C.130) once again, one has $\|T_\psi\varphi\| \leq \|a^*a\varphi\|\,\|\psi\|$. Since $E$ is complete
we may apply the Banach-Steinhaus Theorem B.78, which gives

$$\sup\{\|T_\psi\| \mid \psi \in E,\ \|\psi\| = 1\} < \infty. \tag{C.139}$$

It then follows from (C.132) that $\|a\| < \infty$. Uniqueness and involutivity of the
adjoint are proved as for Hilbert spaces; the former follows from (C.127), the latter
in addition requires (C.124). The space $C^*(E,A)$ is norm-closed, since one easily
verifies from (C.137) and (C.132) that if $a_n \to a$, then $a_n^*$ converges to $a^*$. As a
norm-closed space of linear maps on a Banach space, $C^*(E,A)$ is a Banach algebra,
so that its norm satisfies (C.1). To check (C.2), one infers from (C.132) and the definition
(C.137) of the adjoint that $\|a\|^2 \leq \|a^*a\|$; using (C.1) and the argument leading to
(A.22), one first obtains $\|a^*\| = \|a\|$, and subsequently $\|a^*a\| = \|a\|^2$.

Finally, it follows from (C.126), (C.86), and (C.137) that for fixed $\psi \in E$, the
map $a \mapsto \langle\psi, a\psi\rangle_A$ from $C^*(E,A)$ to $A$ is positive. Replacing $a$ by $a^*a$ in (C.83) and
using (C.2) and (C.137) then leads to (C.138). □

In our first example the C*-algebra $C^*(A,A)$ is usually called the multiplier algebra,
denoted by $M(A)$. If $A$ has a unit, then $M(A) = A$, but in general $M(A)$ is much
larger than $A$, and obviously it always has a unit (given by the unit operator on $A$).

Proposition C.70. For any commutative C*-algebra $A$ we have an isomorphism

$$M(A) \xrightarrow{\ \cong\ } C_b(\Sigma(A)); \tag{C.140}$$

$$a \mapsto \hat{a}, \tag{C.141}$$

where, in terms of the Gelfand isomorphism $A \cong C_0(\Sigma(A))$, $f \mapsto \hat{f}$, we have

$$\widehat{a(f)} = \hat{a}\hat{f}. \tag{C.142}$$

In particular, for any locally compact space $X$ we have an isomorphism

$$M(C_0(X)) \cong C_b(X), \tag{C.143}$$

where $a \in C_b(X)$ simply acts on $f \in C_0(X)$ by $a(f) = af$.

Proof. If $A$ is commutative, then by Theorem C.69.1, any $a \in M(A)$ satisfies

$$a(fg) = a(f)g = f a(g), \quad f,g \in A. \tag{C.144}$$

For any $f,g \in A$ and $\omega \in \Sigma(A)$ such that $\omega(f) \neq 0$ and $\omega(g) \neq 0$, the second equality
in (C.144) gives $\omega(a(f))/\omega(f) = \omega(a(g))/\omega(g)$. Since $\omega \neq 0$, there is at least one
$f \in A$ for which $\omega(f) \neq 0$, so that the function $\hat{a} : \Sigma(A) \to \mathbb{C}$ given by

$$\hat{a}(\omega) = \frac{\omega(a(f))}{\omega(f)} = \frac{\widehat{a(f)}(\omega)}{\hat{f}(\omega)} \tag{C.145}$$

is well defined. Thus (C.142) holds by construction. Since $a(f) \in A$, continuity of
the Gelfand transform makes $\hat{a}$ continuous. Next, we estimate

$$|\hat{a}(\omega)\hat{f}(\omega)| = |\widehat{a(f)}(\omega)| \leq \|\widehat{a(f)}\|_\infty = \|a(f)\| \leq \|a\|\,\|f\|, \tag{C.146}$$

where we used (C.145) and isometry of the Gelfand transform, cf. (C.18). Hence

$$|\hat{a}(\omega)| = \frac{|\hat{a}(\omega)\hat{f}(\omega)|}{|\hat{f}(\omega)|} \leq \frac{\|a\|}{|\hat{f}(\omega)|}, \tag{C.147}$$

for any $f \in A$ and $\omega \in \Sigma(A)$ for which $\omega(f) \neq 0$ and $\|f\| = 1$. For those, we have

$$\inf\{|\hat{f}(\omega)|^{-1} \mid \omega \in \Sigma(A),\ \omega(f) \neq 0,\ \|f\| = 1\}
= \bigl(\sup\{|\hat{f}(\omega)| \mid \omega \in \Sigma(A),\ \omega(f) \neq 0,\ \|f\| = 1\}\bigr)^{-1} = \|\hat{f}\|_\infty^{-1} = 1, \tag{C.148}$$

again using $\|f\| = \|\hat{f}\|_\infty$. Together with (C.147), this gives $|\hat{a}(\omega)| \leq \|a\|$, and hence

$$\|\hat{a}\|_\infty \leq \|a\|. \tag{C.149}$$

In particular, $\hat{a}$ is bounded, so that the map (C.140) - (C.141) is well defined. This
map has an inverse, as clearly any function $\hat{a} \in C_b(\Sigma(A))$ defines an element of
$M(C_0(\Sigma(A)))$ by multiplication, and hence defines an element $a \in M(A)$ by the
inverse Gelfand transform, cf. (C.142). □

Since an isomorphism of C*-algebras is isometric, we have $\|\hat{a}\|_\infty = \|a\|$. This may
also be proved directly from (C.149) and the converse inequality

$$\|a\| = \sup\{\|a(f)\| \mid f \in A,\ \|f\| = 1\} = \sup\{\|\widehat{a(f)}\|_\infty \mid f \in A,\ \|\hat{f}\|_\infty = 1\}
= \sup\{\|\hat{a}\hat{f}\|_\infty \mid f \in A,\ \|\hat{f}\|_\infty = 1\} \leq \|\hat{a}\|_\infty. \tag{C.150}$$
Most of this argument also works for the pre-Hilbert $C_0(X)$-module $E = C_c(X)$
(whose completion is $C_0(X)$, of course), except for the inequality (C.149), which
relies on boundedness of $a$ (cf. Theorem C.69). This is lost if $E$ fails to be complete,
and we now merely obtain an isomorphism of algebras with involution:

$$M(C_c(X)) \cong C(X). \tag{C.151}$$

For a slightly different take on this, for a general C*-algebra $A$ we define an
unbounded multiplier on $A$ (seen as a Hilbert $A$-module) as a closed $\mathbb{C}$-linear and
$A$-linear map $m : D(m) \to A$, where $D(m)$ is a dense right-ideal in $A$ (in the algebraic
sense, i.e., by exception we do not require an ideal to be closed). In general, the set
$UM(A)$ of all unbounded multipliers on $A$ has little algebraic structure (like the set
of all closed operators on a Hilbert space), but in the commutative case we have

$$UM(C_0(X)) \cong C(X), \tag{C.152}$$

under the same identification as in (C.143). This means that any unbounded
multiplier on $C_0(X)$ takes the form $g \mapsto fg$ for some $f \in C(X)$, with domain

$$D(f) = \{g \in C_0(X) \mid fg \in C_0(X)\}. \tag{C.153}$$

The argument is the same as in the proof of Proposition C.70 (except for boundedness),
adding the fact that $C_c(X)$ is a core for each $f$, in that its closure (defined as
usual by the set of all $g \in C_0(X)$ for which there is a sequence $(g_n)$ in $C_c(X)$ such
that $g_n \to g$ and $fg_n$ is Cauchy) is given by $D(f)$; then $fg_n \to fg$ (in the sup-norm).
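To make the domain (C.153) concrete, here is a small worked instance. The choices $X = \mathbb{R}$, $f(x) = x$, and the test functions $g_1, g_2$ are ours and do not appear in the text.

```latex
X = \mathbb{R}, \qquad f(x) = x, \qquad
D(f) = \{\, g \in C_0(\mathbb{R}) \mid x \mapsto x\,g(x) \in C_0(\mathbb{R}) \,\}.
\\[4pt]
g_1(x) = \frac{1}{1+x^2} \in D(f),
\quad\text{since } \frac{x}{1+x^2} \to 0 \ \ (|x| \to \infty);
\\[4pt]
g_2(x) = \frac{1}{1+|x|} \notin D(f),
\quad\text{since } \frac{x}{1+|x|} \to \pm 1 \ \ (x \to \pm\infty).
```

Since this $f$ is unbounded, it does not lie in $C_b(\mathbb{R}) \cong M(C_0(\mathbb{R}))$, cf. (C.143); it only defines a multiplier on the dense ideal $D(f)$, which contains $C_c(\mathbb{R})$.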

Let us return to the bounded case, concentrating on the multiplier algebra

$$M(A) = C^*(A,A). \tag{C.154}$$

Proposition C.71. There is an inclusion $A \hookrightarrow M(A)$, where $A$ (seen as a subspace
of $B(A)$) acts on $A$ (seen as a Hilbert $A$-module) by left multiplication. Moreover, $A$
is an essential ideal in $M(A)$, in having nonzero intersection with any other nonzero ideal.

Proof. We first note that each map $L_a : b \mapsto ab$ ($a,b \in A$) is adjointable, because

$$\langle c, L_a(b)\rangle_A = \langle c, ab\rangle_A = c^*ab = (a^*c)^*b = \langle a^*c, b\rangle_A = \langle L_{a^*}(c), b\rangle_A,$$

so that the adjoint of $L_a$ is $L_{a^*}$. Furthermore, $L_a = 0$ iff $a = 0$, as can be seen by
taking an approximate unit in $A$, or from Lemma C.47. Hence $A \subseteq M(A)$, which is a
proper inclusion iff $A$ has no unit (since $M(A)$ always has one, i.e. the unit of $B(A)$).

Now let $m \in M(A)$ and $a \in A$. Then $(m \circ a)(b) = m(ab) = m(a)b$, since $m \in
C^*(A,A)$ is $A$-linear. Hence $ma \equiv m \circ a \in A$, since $m(a) \in A$. Since $am = (m^*a^*)^*$,
this argument shows that also $am \in A$, making $A$ an ideal in $M(A)$.

To see that this ideal is essential, we note (as a little exercise) that an ideal $J \subseteq B$
in a C*-algebra $B$ is essential iff $bJ = 0$ (i.e., $bj = 0$ for each $j \in J$ and some $b \in B$)
implies $b = 0$. Again by Lemma C.47, if $b(ja) = 0$ for each $j \in A$, $a \in A$, and some
$b \in M(A)$, then $b(c) = 0$ for each $c \in A$, and hence $b = 0$. □
In general, one may compute $M(A)$ as follows. If $A$ and $B$ are C*-algebras and
$E$ is a Hilbert $A$-module, we say that a homomorphism $\alpha : B \to C^*(E,A)$ is
nondegenerate if $\alpha(B)E^- = E$, that is, if the closed linear span of all vectors of the type
$\alpha(b)\psi$, where $b \in B$ and $\psi \in E$, equals $E$. It can be shown (from the Cohen-Hewitt
factorization theorem) that in this case one needs neither the linear span nor the
closure to recover $E$, in that each element of $E$ literally factorizes:

$$E = \{\alpha(b)\psi \mid b \in B,\ \psi \in E\}. \tag{C.155}$$

Theorem C.72. Suppose $A$ and $B$ are C*-algebras, $E$ is a Hilbert $A$-module, and

$$\alpha : B \to C^*(E,A)$$

is a nondegenerate homomorphism. If $B$ is an ideal in a C*-algebra $C$, then $\alpha$ has a
unique extension to $C$ (which is injective if $B$ is essential in $C$ and $\alpha$ is injective).

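Proof. The idea is easy: write $\varphi \in E$ as $\varphi = \alpha(b)\psi$ for some $b \in B$ and $\psi \in E$, cf.
(C.155), and define the desired extension

$$\tilde{\alpha} : C \to C^*(E,A) \tag{C.156}$$

by

$$\tilde{\alpha}(c)\varphi = \alpha(cb)\psi, \tag{C.157}$$

provided this is well defined (in which case $\tilde{\alpha}$ is clearly uniquely determined by $\alpha$).
Adjointability then also follows, since we may define $\tilde{\alpha}(c)^* = \tilde{\alpha}(c^*)$, and compute

$$\langle\tilde{\alpha}(c)^*\alpha(b')\psi', \alpha(b)\psi\rangle_A = \langle\alpha(c^*b')\psi', \alpha(b)\psi\rangle_A = \langle\psi', \alpha(c^*b')^*\alpha(b)\psi\rangle_A
= \langle\psi', \alpha(b')^*\alpha(cb)\psi\rangle_A
= \langle\alpha(b')\psi', \tilde{\alpha}(c)\alpha(b)\psi\rangle_A. \tag{C.158}$$

Furthermore, it is easy to see that $\tilde{\alpha}$ is a homomorphism. Also, $\tilde{\alpha}(c) = 0$ for $c \in C$
implies $\alpha(cb) = 0$ for each $b \in B$; if $\alpha$ is injective, then $cb = 0$, and if $B$ is an
essential ideal in $C$, then $c = 0$, so that $\tilde{\alpha}$ is injective.

To show that (C.157) is independent of the representatives $b$ and $\psi$, we estimate

$$\|\tilde{\alpha}(c)\alpha(b)\psi\| = \lim_\lambda \|\alpha(ce_\lambda b)\psi\| = \lim_\lambda \|\alpha(ce_\lambda)\alpha(b)\psi\|
\leq \lim_\lambda \|\alpha(ce_\lambda)\|\,\|\alpha(b)\psi\| \leq \lim_\lambda \|ce_\lambda\|\,\|\alpha(b)\psi\|
\leq \|c\|\,\|\alpha(b)\psi\|, \tag{C.159}$$

where $(e_\lambda)$ is an approximate unit in $B$ (so that $ce_\lambda \in B$). In particular, if
$\alpha(b)\psi = \alpha(b')\psi'$, then

$$\tilde{\alpha}(c)\alpha(b)\psi = \tilde{\alpha}(c)\alpha(b')\psi'. \qquad □$$

This proof works also without (C.155); one then has a finite sum $\varphi = \sum_i \alpha(b_i)\psi_i$,
and a computation similar to the previous one shows that $\tilde{\alpha}(c)$ is bounded on
the dense subspace of $E$ consisting of such sums.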
This theorem (with $B \leadsto A$ and $E \leadsto A$) explains in which sense $M(A)$ is a
maximal unitization of $A$ (whereas $\dot{A}$ is a minimal one); all we need to do is abstractly
define a unitization of a non-unital C*-algebra $A$ as a unital C*-algebra containing
$A$ as an essential ideal (cf. Proposition C.71). This incorporates both $\dot{A}$ and $M(A)$,
each being distinguished by a universal property it satisfies, namely:

Corollary C.73. For each unital C*-algebra $C$ containing $A$ as an essential ideal,
there are unique injective homomorphisms $C \to M(A)$ and $\dot{A} \to C$ whose restriction
to $A$ is the identity map. In other words, denoting the inclusion of $A$ into $C$ by $\iota$, we
have commutative diagrams

$$
\begin{array}{ccc}
A & \hookrightarrow & \dot{A} \\
 & {\scriptstyle\iota}\searrow & \downarrow \\
 & & C
\end{array}
\qquad\qquad
\begin{array}{ccc}
A & \hookrightarrow & M(A) \\
 & {\scriptstyle\iota}\searrow & \uparrow \\
 & & C
\end{array}
$$

The topological counterpart of this corollary is the construction of the one-point
compactification $\dot{X}$ and of the Čech-Stone compactification $\beta X$, respectively; cf.
Lemma C.38, which we may now supplement by simply defining $\beta X$ as the Gelfand
spectrum of the commutative C*-algebra $C_b(X) \cong C(\beta X)$. In this analogy, the
condition on an ideal $B \subseteq C$ to be essential simply corresponds to a non-compact space
$X$ being a dense subspace of some compactification of it.

Corollary C.74. Let $E$ be some Hilbert $A$-module and let $\alpha : B \to C^*(E,A)$ be an
injective nondegenerate homomorphism. The unique extension $\tilde{\alpha} : M(B) \to C^*(E,A)$
of $\alpha$ that exists according to Theorem C.72 maps $M(B)$ isomorphically onto

$$Z_\alpha(E) = \{a \in C^*(E,A) \mid a\alpha(b) \in \alpha(B),\ \alpha(b)a \in \alpha(B)\ \forall b \in B\}. \tag{C.160}$$

Proof. Note that $Z_\alpha(E)$ is essential in $C^*(E,A)$, as easily follows from the
nondegeneracy of $\alpha$. Therefore, by the argument just given (plus the abstract nonsense
that shows that universal objects are unique up to isomorphism), we only need to
prove that $Z_\alpha(E)$ is a maximal unitization of $B$. Let $B$ be an essential ideal in $C$
and consider the injective extension $\tilde{\alpha} : C \to C^*(E,A)$ of $\alpha$ given by Theorem C.72.
Then $\tilde{\alpha}$ maps $C$ into $Z_\alpha(E)$ by construction, as $\tilde{\alpha}(c)\alpha(b) = \alpha(cb) \in \alpha(B)$, etc. □

Corollary C.75. A nondegenerate homomorphism $\alpha : B \to M(A)$ has a unique
extension to a homomorphism $\tilde{\alpha} : M(B) \to M(A)$.

Proof. Take $C = M(B)$ and $E = A$ in Theorem C.72.1. □

Note that two nondegenerate homomorphisms $\alpha : A \to M(B)$ and $\beta : B \to M(C)$
can be composed into a nondegenerate homomorphism $\beta \circ \alpha : A \to M(C)$, which
by definition equals $\tilde{\beta} \circ \alpha$. Thus one obtains a category CAm whose objects are
C*-algebras and whose arrows are nondegenerate homomorphisms $\alpha : A \to M(B)$, with a
full subcategory CCAm whose objects are commutative C*-algebras (with the same
arrows). This leads to a neat extension of Gelfand duality (cf. Theorem C.45):
Theorem C.76. The category LCH of locally compact Hausdorff spaces and
continuous maps is dual to the category CCAm of commutative C*-algebras just defined.

This claim may be unfolded as in Theorem C.45, omitting ‘proper’ on the topological
side and replacing $\alpha : A \to B$ on the algebraic side by $\alpha : A \to M(B)$.

Proof. First, a continuous map $\varphi : Y \to X$ trivially induces a nondegenerate
homomorphism $\varphi^* : C_0(X) \to C_b(Y)$. Second, since $\omega \in \Sigma(B)$ defines a nondegenerate
homomorphism $B \to \mathbb{C}$, by Theorem C.72 it extends to a homomorphism
$\tilde{\omega} : M(B) \to \mathbb{C}$. Thus the pullback $\alpha^* : \Sigma(B) \to \Sigma(A)$ of a nondegenerate
homomorphism $\alpha : A \to M(B)$ is well defined (and still continuous). Part 3 of Theorem
C.45 stays the same, and the pertinent naturality properties are easily verified. □

Corollary C.77. A nondegenerate homomorphism $\alpha : C_0(X) \to B(H)$ has a unique
extension to a homomorphism $\tilde{\alpha} : C_b(X) \to B(H)$.

Proof. Taking $A = \mathbb{C}$, $E = H$, and $B = B_0(H)$, Theorem C.72.2 gives

$$M(B_0(H)) \cong B(H). \tag{C.161}$$

Combine this with the previous corollary (with $B \leadsto C_0(X)$ and $A \leadsto B_0(H)$). □

Finally, we show how to reconstruct $A$ as a C*-algebra from $A$ as a Hilbert
$A$-module. The key to this is a more general construction:

Definition C.78. The collection $C^*_0(E,A)$ of “compact” operators on a Hilbert
$A$-module $E$ is the C*-algebra generated (within $C^*(E,A)$) by all operators of the type
$|\varphi\rangle\langle\psi|$, where $\varphi,\psi \in E$, and

$$|\varphi\rangle\langle\psi|(\zeta) = \varphi\langle\psi,\zeta\rangle_A. \tag{C.162}$$

Such operators are easily seen to be adjointable, with adjoint

$$(|\varphi\rangle\langle\psi|)^* = |\psi\rangle\langle\varphi|, \tag{C.163}$$

and hence bounded, with norm majorized by $\|\psi\|\,\|\varphi\|$. If $E = H$ is a Hilbert space,
then $C^*_0(H,\mathbb{C}) = B_0(H)$, since the maps $|\varphi\rangle\langle\psi|$ obviously generate the finite-rank
operators on $H$, whose norm-closure is $B_0(H)$, cf. Proposition B.131. Hence the
name “compact” operators, but in general elements of $C^*_0(E,A)$ need not be compact
(as operators on a Banach space) at all. The next and final example is a case in point:

Proposition C.79. If $E = A$ as a Hilbert $A$-module in the usual way, then

$$C^*_0(A,A) \cong A. \tag{C.164}$$

Proof. We have $|a\rangle\langle b| = L_{ab^*}$, where $a \mapsto L_a$ is the canonical map from $A$ to
$C^*(A,A) \subseteq B(A)$ given by $L_a(b) = ab$, see Proposition C.30. This map is isometric,
cf. (C.63), and hence injective. The map $|a\rangle\langle b| \mapsto ab^*$ from the linear span of
all operators (C.162) within $C^*_0(E,A)$ to $A$ is therefore bounded, and has dense image
by Lemma C.47. Its unique continuous extension maps $C^*_0(E,A)$ onto $A$, see
Theorem C.62.5 (or use the Cohen-Hewitt factorization theorem to conclude). □
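The identity $|a\rangle\langle b| = L_{ab^*}$ used at the start of this proof can be checked directly from (C.162) and (C.135); the following short computation is a sketch in which $c \in A$ is an arbitrary test element of our choosing.

```latex
|a\rangle\langle b|(c) \;=\; a\,\langle b, c\rangle_A \;=\; a\,b^* c \;=\; L_{ab^*}(c)
\qquad (a, b, c \in A),
\\[4pt]
\bigl(|a\rangle\langle b|\bigr)^* \;=\; (L_{ab^*})^* \;=\; L_{(ab^*)^*} \;=\; L_{ba^*} \;=\; |b\rangle\langle a| .
```

The second line uses that the adjoint of $L_a$ is $L_{a^*}$ (see the proof of Proposition C.71) and agrees with (C.163).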
C.11 Gelfand topology as a frame

In the traditional approach to the Gelfand isomorphism, which we have followed
so far, the Gelfand spectrum $\Sigma(A)$ of a commutative unital C*-algebra $A$ is first
constructed as a set, upon which it is equipped with a natural topology $\mathcal{O}(\Sigma(A))$,
i.e., the Gelfand topology. Alternatively, one may start with the latter and reconstruct
$\Sigma(A)$ as a set from it. This not only gives a better conceptual understanding of
Gelfand’s theory (relating it, for example, to a well-known construction in algebraic
geometry); it also has the technical advantage of making good sense in constructive
mathematics and hence in topos theory (which the classical theory does not).

In the language of lattice theory, the topology $\mathcal{O}(X)$ of any space $X$ is an example
of a so-called frame (cf. Appendix D, compared to which we change notation so as
to avoid abuse of the ubiquitous symbol $X$), i.e., a complete lattice $L$ in which

$$U \wedge \bigvee S = \bigvee\{U \wedge V,\ V \in S\}, \tag{C.165}$$

for arbitrary elements $U \in L$ and subsets $S \subseteq L$. This is sometimes written in the
form $U \wedge (\bigvee_\lambda V_\lambda) = \bigvee_\lambda (U \wedge V_\lambda)$, from which it is clear that the (binary) distributive
law $U \wedge (V \vee W) = (U \wedge V) \vee (U \wedge W)$, which of course is implied by (C.165), is
now required for arbitrary families. Indeed, the definition of a frame is primarily
motivated by the example $L = \mathcal{O}(X)$, in which it should be noted that the supremum

$$\bigvee S = \bigcup S = \bigcup_\lambda \{U_\lambda \in S\}, \tag{C.166}$$

is simply given by the set-theoretic union of the elements of $S$, which are open sets
whose union is open by definition of a topology, whereas the infimum of arbitrary
families of open sets has to be doctored so as to make it open, and hence is given by

$$\bigwedge S = \bigvee\{U \in \mathcal{O}(X) \mid U \subseteq V\ \forall V \in S\}. \tag{C.167}$$

Frame maps, then, are defined as order-preserving maps between the underlying
posets that preserve finite infima and arbitrary joins. For example, if

$$\varphi : Y \to X \tag{C.168}$$

is a continuous map, then the inverse image map

$$\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y) \tag{C.169}$$

is a frame map. This also defines the category Frm of frames, whose opposite
category (that has the same objects but all arrows inverted) is called the category Loc
of locales. Thus a locale is a frame, seen as an object in the opposite category. If
no confusion arises (which, unfortunately, is rarely the case), elements of Frm are
written as $\mathcal{O}(X)$, even if they are not topologies (and indeed there are such frames,
see below), in which case the corresponding element of Loc is written as $X$.
In this spirit, frame maps are always written as (C.169), in which case the map in
the opposite direction between the corresponding locales is (C.168). This notation
suggests the right way of thinking, and we will use it whenever it is convenient.

Frames are very closely related to Heyting algebras, which were originally meant
to formalize the intuitionistic (propositional) logic of Brouwer, and are defined as
distributive lattices $L$ (with top $\top$ and bottom $\bot$) equipped with a binary map

$$\to\ : L \times L \to L, \tag{C.170}$$

playing the role of implication in logic, that satisfies the axiom

$$U \leq (V \to W) \ \text{ iff } \ (U \wedge V) \leq W. \tag{C.171}$$

Every Boolean algebra is a Heyting algebra, but not vice versa; in fact, a Heyting
algebra is Boolean iff $\neg\neg U = U$ for all $U$, which is the case iff $(\neg U) \vee U = \top$ for all
$U$ (which states the law of the excluded middle denied by Brouwer). In a Heyting
algebra (unlike a Boolean algebra), negation is a derived notion, defined by

$$\neg U = U \to \bot. \tag{C.172}$$

A Heyting algebra is complete when it is complete as a lattice, in that arbitrary
suprema (and hence also infima) exist. The infinite distributivity law (C.165) is
automatically satisfied in a complete Heyting algebra, which therefore is also a frame.
Conversely, a frame may be turned into a complete Heyting algebra by defining

$$V \to W = \bigvee\{U \mid U \wedge V \leq W\}. \tag{C.173}$$

Frames and complete Heyting algebras drift apart as soon as morphisms are
concerned, for although in both cases one requires maps to preserve the partial order,
maps between Heyting algebras must preserve $\to$ rather than infinite suprema.
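The passage from a frame to a Heyting algebra via (C.173) can be tried out on a finite topology. The following is a minimal sketch, not taken from the text: the function names are hypothetical, open sets are modelled as Python frozensets, and the Sierpinski space is our own choice of example. It computes $V \to W$ and $\neg U$ and checks the axiom (C.171) by brute force.

```python
def heyting_implication(topology, v, w):
    """Return V -> W as the union of all opens U with U ∩ V ⊆ W, cf. (C.173)."""
    result = frozenset()
    for u in topology:
        if (u & v) <= w:
            result = result | u
    return result


def negation(topology, u):
    """Return ¬U = U -> ⊥, with ⊥ the empty set, cf. (C.172)."""
    return heyting_implication(topology, u, frozenset())


def satisfies_adjunction(topology):
    """Check U ≤ (V -> W) iff U ∧ V ≤ W for all opens U, V, W, cf. (C.171)."""
    return all(
        (u <= heyting_implication(topology, v, w)) == ((u & v) <= w)
        for u in topology for v in topology for w in topology
    )


# Illustrative choice (not from the text): the Sierpinski space X = {0, 1}
# with topology {∅, {1}, X}.
X = frozenset({0, 1})
sierpinski = [frozenset(), frozenset({1}), X]

U = frozenset({1})
not_U = negation(sierpinski, U)          # interior of the complement of U
not_not_U = negation(sierpinski, not_U)  # double negation of U
excluded_middle = not_U | U              # (¬U) ∨ U
```

In this example $\neg U$ is empty and $\neg\neg U$ is all of $X$, so neither $\neg\neg U = U$ nor $(\neg U) \vee U = \top$ holds: the frame is a Heyting algebra that is not Boolean.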

The map $X \mapsto \mathcal{O}(X)$ from topological spaces to frames (which extends to a
contravariant functor in the obvious way, i.e., via (C.168) - (C.169)) is a competitor to
the map $X \mapsto C_0(X)$ from topological spaces to commutative C*-algebras, and one
goal of this section is to find out how these two constructions are related.

First, there is a frame-theoretic analogue of the categorical duality between
locally compact Hausdorff spaces and commutative C*-algebras (cf. Theorem C.45),
in which locally compact Hausdorff spaces are replaced by so-called sober spaces
(and no restrictions on continuous maps are made), whilst the category of frames
must be restricted to so-called spatial frames (which move is somewhat analogous
to restricting C*-algebras to commutative ones). We now explain these notions.

A particularly simple frame is $2 = \{0,1\} \equiv \{\bot,\top\}$, with order $0 \leq 1$; this is just
the topology of a singleton $*$. In agreement with the above convention, a frame
map $p^{-1} : \mathcal{O}(X) \to 2$ will be written as a locale map $p : * \to X$. Such a map defines
a point of the locale $X$ (i.e., of the frame $\mathcal{O}(X)$), and we denote the set of points of
$X$ by $\mathrm{Pt}(X)$. To appreciate this definition, let us suppose that $\mathcal{O}(X)$ is the topology
of some space $X$. Each point $x \in X$ then corresponds to a genuine map

$$p_x : * \to X, \quad p_x(*) = x, \tag{C.174}$$

whose inverse image map $p_x^{-1} : \mathcal{O}(X) \to 2$ is a frame map and hence defines a point
in the above sense. Conversely, if $X$ is sober (see below), each point of $\mathcal{O}(X)$ arises
in that way. The set $\mathrm{Pt}(X)$ has a natural topology, with opens

$$\mathrm{Pt}(U) = \{p \in \mathrm{Pt}(X) \mid p(*) \in U\}, \tag{C.175}$$

where $U \in \mathcal{O}(X)$; here $p(*) \in U$ really means $p^{-1}(U) = 1$. This gives a frame map

$$U \mapsto \mathrm{Pt}(U) \tag{C.176}$$

from $\mathcal{O}(X)$ to $\mathcal{O}(\mathrm{Pt}(X))$. We say $\mathcal{O}(X)$ (or the locale $X$) is spatial if this map
is an isomorphism of frames. Roughly speaking, therefore, spatial frames are just
topologies (an example of a non-spatial frame is the lattice $\mathcal{O}_{\mathrm{reg}}(\mathbb{R})$ of regular open
subsets of $\mathbb{R}$, i.e., of open subsets $U$ with the property $\neg\neg U = U$, where $\neg U$ is the
interior of the complement of $U$). This does not mean, however, that any space $X$
can be recovered from its topology $\mathcal{O}(X)$ (seen as a frame), since $\mathrm{Pt}(X)$ may not be
homeomorphic to $X$.

Spaces $X$ for which this is the case are called sober; more precisely, this means
that the map $x \mapsto p_x$ from $X$ to $\mathrm{Pt}(X)$ considered above is a homeomorphism; less
precisely, we may say that sober spaces $X$ may be reconstructed from their topology
$\mathcal{O}(X)$, up to homeomorphism. To give a more direct topological characterization
of sobriety, call $W \in \mathcal{O}(X)$ meet-irreducible if $U \cap V \subseteq W$ (where $U,V \in \mathcal{O}(X)$)
implies either $U \subseteq W$ or $V \subseteq W$. In any space $X$, all open sets of the form $W_x = X \setminus x^-$
are meet-irreducible, where $x \in X$ (and $x^-$ is the closure of $\{x\}$). A space $X$ is sober,
then, iff these are the only such opens. For example, any Hausdorff space is sober
(an example of a non-sober space is $X = \mathbb{N}$ with the unusual topology in which all
complements of finite subsets are open, along with the empty set, of course).
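For finite spaces the points of the frame $\mathcal{O}(X)$ can simply be enumerated. The sketch below is ours (the function names are hypothetical): it lists all frame maps $\mathcal{O}(X) \to 2$ by brute force and, following the remark that sobriety amounts to $x \mapsto p_x$ being a bijection onto $\mathrm{Pt}(X)$, tests that condition. The two example spaces are our own choices; the cofinite topology on $\mathbb{N}$ mentioned above is infinite and cannot be treated this way.

```python
from itertools import product


def frame_points(topology):
    """List all frame maps O(X) -> 2, each encoded as the set of opens sent to 1."""
    opens = list(topology)
    top = frozenset().union(*opens)
    points = []
    for values in product([0, 1], repeat=len(opens)):
        p = dict(zip(opens, values))
        # A frame map sends the top element to 1 and the bottom element to 0 ...
        if p[top] != 1 or p[frozenset()] != 0:
            continue
        # ... and preserves binary meets and joins (enough for a finite frame).
        preserves = all(
            p[u & v] == (p[u] & p[v]) and p[u | v] == (p[u] | p[v])
            for u in opens for v in opens
        )
        if preserves:
            points.append(frozenset(u for u in opens if p[u] == 1))
    return points


def is_sober(space, topology):
    """Check that x -> p_x is a bijection from X onto the points of O(X)."""
    from_space = [frozenset(u for u in topology if x in u) for x in space]
    injective = len(set(from_space)) == len(space)
    return injective and set(from_space) == set(frame_points(topology))


# Illustrative finite examples (not from the text).
X = frozenset({0, 1})
sierpinski = [frozenset(), frozenset({1}), X]   # opens: ∅, {1}, X
indiscrete = [frozenset(), X]                   # opens: ∅, X

sierpinski_sober = is_sober(X, sierpinski)
indiscrete_sober = is_sober(X, indiscrete)
```

The Sierpinski space has exactly two frame points, matching its two elements, whereas the indiscrete two-point space has only one frame point, so its two elements cannot be told apart by the topology and the space is not sober.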

The category Frm, then, has a full subcategory Spat of spatial frames, whilst 
likewise the category Top of topological spaces has a full subcategory Sob of sober 
spaces. We now have the following counterpart of Theorem C.45: 

Theorem C.80. The categories Spat and Sob are dual, in that:

1. If $X$ is a sober space, then $\mathcal{O}(X)$ is a spatial frame. A continuous map $\varphi : Y \to X$
induces a frame map $\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y)$ in the natural way, such that if we
have another continuous map $\psi : Z \to Y$, then $(\varphi \circ \psi)^{-1} = \psi^{-1} \circ \varphi^{-1}$.

2. If $\mathcal{O}(X)$ is a spatial frame, then $\mathrm{Pt}(X)$ is a sober space. Furthermore, a locale
map $\varphi : Y \to X$ (i.e., a frame map $\varphi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y)$) induces a continuous
function $\varphi_* : \mathrm{Pt}(Y) \to \mathrm{Pt}(X)$ by $\varphi_*(p) = \varphi \circ p$ (i.e., $\varphi_*(p^{-1}) = p^{-1} \circ \varphi^{-1}$), which
similarly behaves well under composition.

3. There are canonical homeomorphisms and frame maps:

$$p_X : X \to \mathrm{Pt}(\mathcal{O}(X)), \quad x \mapsto p_x; \tag{C.177}$$

$$\mathrm{Pt}_X : \mathcal{O}(X) \to \mathcal{O}(\mathrm{Pt}(\mathcal{O}(X))), \quad U \mapsto \mathrm{Pt}(U), \tag{C.178}$$

cf. (C.174) - (C.176), with the correct naturality properties (cf. Theorem C.23).
Proof. We will not give a complete proof of this, but the main points are that: 

• Any locale defined by the topology of some space is spatial.

• Any space $\mathrm{Pt}(X)$ of points of some locale $X$ (not necessarily a space) is sober.

• The map $p_X$ in (C.177) is a homeomorphism by definition of sobriety (which,
alternatively, could have been defined by requiring bijectivity of $p_X$, in which
case it can be shown that this map is continuous as well as open).

• By definition of the topology on $\mathrm{Pt}(X)$, the map (C.176) is surjective for any
locale $X$. If $X$ is spatial, then for any distinct elements $U,V \in \mathcal{O}(X)$ there is a
point $p$ such that $p^{-1}(U) \neq p^{-1}(V)$, but this is the same as saying that $\mathrm{Pt}(U) =
\mathrm{Pt}(V)$ implies $U = V$. So in that case, (C.176) is also injective. □

Our aim is to apply these ideas to Gelfand duality, specifically to an independent
description of the topology $\mathcal{O}(\Sigma(A))$ of the Gelfand spectrum $\Sigma(A)$ of a commutative
C*-algebra $A$. To put this in perspective, let $A$ for the moment be a general
C*-algebra, and recall Definition C.55 of left, right and two-sided ideals (all taken
to be closed by definition). Further to these, there is another interesting notion.

Definition C.81. A hereditary subalgebra of a C*-algebra $A$ is a C*-subalgebra $B$
of $A$ with the property that $a \leq b$ for $b \in B^+$ and $a \in A^+$ implies $a \in B^+$. The set
of all hereditary subalgebras of $A$ is denoted by $H(A)$.

It is a simple exercise to show that there are bijective correspondences between
hereditary subalgebras $B$ of $A$, left ideals $L$ of $A$, and right ideals $R$ of $A$, given by:

$$L = \{a \in A \mid a^*a \in B^+\}; \tag{C.179}$$

$$R = \{a \in A \mid aa^* \in B^+\}; \tag{C.180}$$

$$B = L \cap L^* = R \cap R^*. \tag{C.181}$$

Furthermore, one has $I(A) \subseteq H(A)$, where $I(A)$ is the set of closed two-sided ideals
in $A$, and likewise we write $L(A)$ and $R(A)$. If $A$ is commutative, these ideals are
two-sided, so that $L^* = L$ etc., and $L = R = B$, so that $H(A) = I(A) = L(A) = R(A)$.

Proposition C.82. The set $H(A)$ is a complete lattice under inclusion as the partial
order, with inf and sup of any subset $S \subseteq H(A)$ given by

$$\bigwedge S = \bigcap S; \tag{C.182}$$

$$\bigvee S = \bigcap\{U \in H(A) \mid V \subseteq U\ \forall V \in S\}. \tag{C.183}$$

Moreover, if $A$ is commutative, then $H(A) = I(A) = L(A) = R(A)$ is a frame.

Proof. The defining conditions on hereditary subalgebras of $A$ are preserved by
arbitrary intersections, which means that $H(A)$ has infima of arbitrary subsets, given
by (C.182). This implies that $H(A)$ also has arbitrary suprema, given by (C.183),
which is a standard formula in lattice theory. Hence $H(A)$ is a complete lattice.

The last claim follows from Corollary C.84 below (and the ensuing fact for
topology). It may also be proved directly, using the fact that $H(A) = I(A)$. □



Proposition C.83. Let $X$ be a locally compact Hausdorff space. Then the map

$$\mathcal{O}(X) \to H(C_0(X)); \tag{C.184}$$

$$U \mapsto C_0(U), \tag{C.185}$$

where $C_0(U)$ is seen as a subspace of $C_0(X)$, is a frame isomorphism, with inverse

$$H(C_0(X)) \to \mathcal{O}(X); \tag{C.186}$$

$$B \mapsto X \setminus F_B, \tag{C.187}$$

where, for any subset $B \subseteq C_0(X)$ one defines the (necessarily closed) set $F_B \subseteq X$ by

$$F_B = \{x \in X \mid f(x) = 0\ \forall f \in B\}. \tag{C.188}$$


Proof. For any open $U \in \mathcal{O}(X)$, we may regard $f \in C_0(U)$ as an element of $C_0(X)$
by extending $f$ to all of $X$ through $f|_{X \setminus U} = 0$. Continuity of $f$ is only an issue at
boundary points of $U^c = X \setminus U$, so take $x_0 \in \partial U^c$ (i.e., any neighbourhood of $x_0$ has
nonempty intersection with both $U^c$ and $U$). Since $f(x_0) = 0$, to prove continuity of
$f$ at $x_0$ we need to show that for any $\varepsilon > 0$, there is a neighbourhood $N$ of $x_0$ such that
$|f(x)| < \varepsilon$ for each $x \in N$. Indeed, since $f \in C_0(U)$, there is a compact set $K \subseteq U$
such that $|f(x)| < \varepsilon$ for each $x \in U \setminus K$ (and hence also for each $x \in X \setminus K$). Then
$x_0 \notin K$ (since $x_0 \in U^c$), so we may take the open neighbourhood $N = X \setminus K$.

Since the ordering in $C_0(X)$ is pointwise, it is trivial that $C_0(U) \in H(C_0(X))$. The
map (C.185) also clearly preserves the order, i.e., if $U \subseteq V$, then $C_0(U) \subseteq C_0(V)$.

Half of the proof that (C.185) and (C.187) are mutually inverse is the equality

$$C_0(U) = C_0(X; X \setminus U), \tag{C.189}$$

where for any $F \subseteq X$ (usually taken to be closed), we define $C_0(X;F) \subseteq C_0(X)$ by

$$C_0(X;F) = \{f \in C_0(X) \mid f|_F = 0\}. \tag{C.190}$$

To prove (C.189), we just need to prove that $C_0(X; X \setminus U) \subseteq C_0(U)$, since the
opposite inclusion has been proved before Proposition C.83. Since $f \in C_0(X)$, for each
$\varepsilon > 0$ and each boundary point $x \in \partial U^c$, there is an open neighbourhood $N_x$ of $x$
where $|f| < \varepsilon$, as well as a compact set $K \subseteq X$ outside which the same is true. Then
$V = \bigl(\bigcup_x N_x\bigr) \cap U$ is open in $U$, so that its complement $U \setminus V$ is closed in $U$, and
$K' = (U \setminus V) \cap K$ is compact in $U$. Clearly, $|f| < \varepsilon$ outside $K'$, whence $f \in C_0(U)$.

Having proved (C.189), the other half of the proof of bijectivity of (C.184) is

$$B = C_0(X; F_B), \tag{C.191}$$

for any $B \in H(C_0(X))$. The inclusion $B \subseteq C_0(X;F_B)$ is trivial. For the converse, we
exploit the fact that $B$ is an ideal in $C_0(X)$, so that $C_0(X)/B$ is a C*-algebra by
Theorem C.60. Let $\tau : C_0(X) \to C_0(X)/B$ be the canonical projection. If $f \notin B$, then
$\tau(f) \neq 0$. Hence there is a character $\omega' \in \Sigma(C_0(X)/B)$ such that $\omega'(\tau(f)) \neq 0$.
Lift $\omega'$ to $\omega = \omega' \circ \tau \in \Sigma(C_0(X)) \cong X$, so that there is $x \in X$ such that $\omega(g) = g(x)$
for all $g \in C_0(X)$. Since $\tau(g) = 0$ for each $g \in B$, we have $\omega(g) = 0$, and hence
$g(x) = 0$ for each $g \in B$, so that $x \in F_B$. But $f(x) \neq 0$, so $f \notin C_0(X;F_B)$, and hence
we have proved the inclusion $C_0(X;F_B) \subseteq B$. □

Thus Proposition C.83 could just as well have been formulated in terms of closed sets,
albeit at the cost of inverting the partial order. Also, note the isomorphism

$$C_0(X)/C_0(U) \cong C_0(X \setminus U), \quad [f] \mapsto f|_{X \setminus U}. \tag{C.192}$$

Corollary C.84. For any commutative C*-algebra $A$, there is a frame isomorphism

$$\mathcal{O}(\Sigma(A)) \cong H(A). \tag{C.193}$$

This sheds new light on maximal ideals in $A$ as points of the Gelfand spectrum $\Sigma(A)$,
cf. Proposition C.13. We need a lemma that applies to any frame $\mathcal{O}(X)$. A prime
element $P \in \mathcal{O}(X)$ is an element $P \neq \top$ such that $U \wedge V \leq P$ iff $U \leq P$ or $V \leq P$.
For a point $p^{-1} : \mathcal{O}(X) \to 2$, we write $\ker(p^{-1})$ for $\{U \in \mathcal{O}(X) \mid p^{-1}(U) = 0\}$.

Lemma C.85. For any frame $\mathcal{O}(X)$ (i.e. locale $X$), there is a bijective
correspondence between points $p^{-1} : \mathcal{O}(X) \to 2$ of $X$ and prime elements $P \in \mathcal{O}(X)$, viz.

$$P = \bigvee \ker(p^{-1}); \tag{C.194}$$

$$p^{-1}(U) = 0 \ \text{ iff } \ U \leq P. \tag{C.195}$$

Under this correspondence, the topology on $\mathrm{Pt}(X)$ is given by the Zariski topology,
whose closed sets $F_P$ consist of all $Q \supseteq P$, where $P$ is some prime element of $\mathcal{O}(X)$.

Proof. The requirement that $p^{-1}$ be a frame map implies the following properties
of its kernel $K = \ker(p^{-1})$: $\top \notin K$, $U \wedge V \in K$ iff $U \in K$ or $V \in K$, and $\bigvee S \in K$ iff
each $V \in S$ is in $K$. Any subset $K \subseteq \mathcal{O}(X)$ satisfying these properties in turn defines
a point $p$ of $X$ whose kernel is $K$. Then $P = \bigvee K$ is a prime element of $\mathcal{O}(X)$, and
conversely, $K$ (and hence $p$) may be recovered from $P$ as its downset $K = {\downarrow}P$.

The given topology on the set of prime elements is a rewriting of (C.175). □

The prime elements of $H(A)$, where $A$ is a commutative C*-algebra, are the prime
ideals in $A$, i.e., the proper ideals $J \subset A$ such that $J_1J_2 \subseteq J$ iff $J_1 \subseteq J$ or $J_2 \subseteq J$, for
any ideals $J_1, J_2$ of $A$ (closed by definition, like $J$); note that $J_1J_2 = J_1 \cap J_2$.

Theorem C.86. 1. The frame $H(A)$ of hereditary subalgebras of a commutative
C*-algebra $A$ is spatial, with $\mathrm{Pt}(H(A)) \cong \Sigma(A)$ as topological spaces.

2. The prime elements of $H(A)$ are the maximal ideals of $A$, so that, equipping the
set $\mathcal{M}(A)$ of maximal ideals of $A$ with the Zariski topology, also $\mathcal{M}(A) \cong \Sigma(A)$.

Proof. 1. Proposition C.83 bijectively relates prime elements in $H(A)$ to
meet-irreducible open sets in $\Sigma(A)$. The description of sobriety in terms of meet-irreducibility
after (C.176), which applies because $\Sigma(A)$ is locally compact Hausdorff and
hence sober, then bijectively relates these meet-irreducible sets to points of $\Sigma(A)$.
2. Proposition C.13 in turn relates points of $\Sigma(A)$ to maximal ideals of $A$. □
C.12 The structure of C*-algebras

Having understood the structure of commutative C*-algebras, we now turn to the
general case. We already know that the algebra $B(H)$ of all bounded operators on
some Hilbert space $H$ is a C*-algebra in the obvious way (i.e., the algebraic
operations are the natural ones, the involution is the operator adjoint $a \mapsto a^*$, and
the norm is the operator norm of Banach space theory). Moreover, each (operator)
norm-closed *-algebra in $B(H)$ is a C*-algebra. Our goal is to prove the converse:

Theorem C.87. Each C*-algebra $A$ is isomorphic to a norm-closed *-algebra in
$B(H)$, for some Hilbert space $H$. Equivalently, for any C*-algebra $A$ there exist a
Hilbert space $H$ and an injective homomorphism $\pi : A \to B(H)$.

A homomorphism $\pi : A \to B(H)$ is called a representation of $A$ on $H$. The
equivalence between the two statements in the theorem follows from Theorem C.62.

Let us note that Theorems C.8 and C.87 harmonize as follows: any measure $\mu$
on $X$ satisfying $\mu(U) > 0$ for each nonempty open $U \subseteq X$ leads to an injective representation
of $C_0(X)$ on $L^2(X,\mu)$ by multiplication operators, that is, $\pi(f) = m_f$, cf. (B.238).

The proof of Theorem C.87 uses the elegant GNS-construction, named after
Gelfand, Naimark, and Segal, which is important in its own right. We initially
assume that $A$ is unital. First, we call a representation $\pi$ cyclic if its carrier space $H$
contains a cyclic vector $\Omega$ for $\pi$, i.e., the closure of $\pi(A)\Omega$ coincides with $H$.

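Theorem C.88. Let $\omega$ be a state on a C*-algebra $A$. There exists a cyclic
representation $\pi_\omega$ of $A$ on a Hilbert space $H_\omega$ with cyclic unit vector $\Omega_\omega$, such that

$$\omega(a) = \langle\Omega_\omega, \pi_\omega(a)\Omega_\omega\rangle, \quad a \in A. \tag{C.196}$$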

Proof. We first give the proof in the special case that $A$ has a unit $1_A$, and $\omega(a^*a) > 0$
for all $a \neq 0$. Define a sesquilinear form $\langle\cdot,\cdot\rangle$ on $A$ by

$$\langle a, b\rangle = \omega(a^*b). \tag{C.197}$$

This form is positive definite by definition of a state (and our assumption on $\omega$),
so that we may complete $A$ in the ensuing norm

$$\|a\|_\omega = \sqrt{\omega(a^*a)}, \tag{C.198}$$

to a Hilbert space called $H_\omega$. For each $a \in A$, we then define a map

$$\pi_\omega(a) : A \to A; \tag{C.199}$$

$$\pi_\omega(a)b = ab. \tag{C.200}$$

Regarding $A$ as a dense subspace of $H_\omega$, this defines an operator $\pi_\omega(a)$ on a dense
domain in $H_\omega$. This operator is bounded, since (C.94) implies

$$\|\pi_\omega(a)\| \leq \|a\|. \tag{C.201}$$

Hence $\pi_\omega(a)$ may be extended from $A$ to $H_\omega$ by continuity, and we obtain a map
$\pi_\omega : A \to B(H_\omega)$. Simple computations show that $\pi_\omega$ is a representation. The special
vector $\Omega_\omega$ is the unit $1_A \in A$, seen as an element of $H_\omega$: its cyclicity is obvious, and

$$\|\Omega_\omega\|^2 = \langle\Omega_\omega,\Omega_\omega\rangle = \omega(1_A^* 1_A) = \omega(1_A) = 1; \tag{C.202}$$

$$\langle\Omega_\omega, \pi_\omega(a)\Omega_\omega\rangle = \omega(1_A^* a 1_A) = \omega(a). \tag{C.203}$$

Under our standing assumption $\omega(a^*a) > 0$ if $a \neq 0$, this not only proves Theorem
C.88, but also Theorem C.87; for $\pi_\omega(a) = 0$ implies $\|\pi_\omega(a)\Omega_\omega\|^2 = 0$, whose
left-hand side is precisely $\langle\Omega_\omega, \pi_\omega(a^*a)\Omega_\omega\rangle = \omega(a^*a)$. Thus $\pi_\omega$ is faithful.

In general, a C*-algebra may lack such states, and we must adapt the proof of
both theorems. The GNS-construction is easy: for an arbitrary state $\omega$, we introduce

$$N_\omega = \{a \in A \mid \omega(a^*a) = 0\}. \tag{C.204}$$

If $a_\omega$ is the image of $a \in A$ in $A/N_\omega$, we may define an inner product on the latter by

$$\langle a_\omega, b_\omega\rangle = \omega(a^*b); \tag{C.205}$$

this is well defined and positive definite, and we define the Hilbert space $H_\omega$ as the
completion of $A/N_\omega$ in this inner product. Furthermore, we define

$$\pi_\omega(a) : A/N_\omega \to H_\omega; \tag{C.206}$$

$$\pi_\omega(a)b_\omega = (ab)_\omega; \tag{C.207}$$

this is well defined, because $N_\omega$ is a left ideal in $A$ by (C.94). Finally, we define

$$\Omega_\omega = (1_A)_\omega. \tag{C.208}$$

The proof that everything works is then a simple exercise. Another way to look at
the cyclic vector $\Omega_\omega$ is to let $\omega$ define a linear functional $\tilde\omega : A/N_\omega \to \mathbb{C}$ by

$$\tilde\omega(a_\omega) = \omega(a); \tag{C.209}$$

this functional is continuous on $A/N_\omega \subset H_\omega$, because $|\omega(a)|^2 \leq \omega(a^*a) = \|a_\omega\|^2$,
as follows from the Cauchy-Schwarz inequality for the positive semidefinite form
(C.197). Hence by Riesz-Fréchet there is an implementing vector $\Omega_\omega$ such that

$$\omega(a) = \langle\Omega_\omega, a_\omega\rangle. \tag{C.210}$$

Finally, when $A$ has no unit, in defining $\Omega_\omega$ we either use the GNS-construction for
the unitization $\dot{A}$ and restrict the resulting representation of $\dot{A}$ to $A$ to define $\pi_\omega(A)$, or use (C.210). □
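As a concrete illustration of the special case treated first, the following sketch takes $A = M_2(\mathbb{C})$ with the state $\omega(a) = \mathrm{Tr}(\rho a)$. The density matrix $\rho = \mathrm{diag}(3/4, 1/4)$ and all helper names are assumptions made for this example only; since $\rho$ is invertible, $\omega(a^*a) > 0$ for $a \neq 0$, so that $H_\omega$ is $A$ itself with the inner product (C.197), $\pi_\omega(a)b = ab$, and $\Omega_\omega = 1_A$.

```python
# GNS data for A = M_2(C) and the faithful state omega(a) = Tr(rho a).
# rho = diag(3/4, 1/4) is an assumed example; helper names are illustrative.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose a* of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

RHO = [[0.75, 0.0], [0.0, 0.25]]
ONE = [[1.0, 0.0], [0.0, 1.0]]   # the unit 1_A, which serves as Omega_omega

def omega(a):
    """The state omega(a) = Tr(rho a)."""
    return trace(matmul(RHO, a))

def inner(a, b):
    """GNS inner product <a, b> = omega(a* b), cf. (C.197)."""
    return omega(matmul(adjoint(a), b))

def pi(a, b):
    """GNS representation pi_omega(a) b = a b, cf. (C.200)."""
    return matmul(a, b)

a = [[1 + 2j, 3.0], [0.5j, -1.0]]
lhs = omega(a)                    # omega(a)
rhs = inner(ONE, pi(a, ONE))      # <Omega, pi_omega(a) Omega>, cf. (C.196)
norm_sq_omega = inner(ONE, ONE)   # ||Omega||^2, cf. (C.202)
```

Here `lhs` and `rhs` agree, which is (C.196) for this particular state. For a non-faithful $\rho$ one would first have to divide out the null space $N_\omega$ of (C.204).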

One of the nicest features of the GNS-construction is the link between purity of
the state $\omega$ and irreducibility of the corresponding representation $\pi_\omega$.

Definition C.89. We call a representation $\pi$ of a C*-algebra $A$ on a Hilbert space
$H$ irreducible if the only closed subspaces $K$ of $H$ that are stable under $\pi(A)$ (in
the sense that if $\psi \in K$, then $\pi(a)\psi \in K$ for all $a \in A$) are $K = H$ and $K = \{0\}$.

Theorem C.90. Each of the following conditions is equivalent to irreducibility:

1. $\pi(A)' = \mathbb{C}\cdot 1$, where $S'$ is the commutant of $S \subset B(H)$ (Schur's Lemma);

2. $\pi(A)'' = B(H)$;

3. Every nonzero vector in $H$ is cyclic for $\pi(A)$.

Furthermore, if $\omega$ is a state on $A$, then $\omega$ is pure iff the corresponding
GNS-representation $\pi_\omega$ is irreducible.

Proof. If $\pi(A)' \neq \mathbb{C}\cdot 1$, then $\pi(A)'$ must contain a nontrivial self-adjoint element
$a$ (as it is a *-algebra), and hence also a nontrivial projection $e$ (as the spectral
projections $e_\Delta = 1_\Delta(a)$ of $a$, defined as in Theorem B.102, lie in $\pi(A)'$, too). But if
$e \in \pi(A)'$, then $eH$ is stable under $\pi(A)$, and hence $\pi$ cannot be irreducible. Thus
irreducibility implies 1. Conversely, if $\pi(A)' = \mathbb{C}\cdot 1$, then $\pi$ must be irreducible by
the same argument, since if not, the projection onto some proper stable subspace
$K$ for $\pi$ would be a nontrivial element of $\pi(A)'$. The equivalence $1 \Leftrightarrow 2$ is clear,
since $(\mathbb{C}\cdot 1)' = B(H)$. Similarly, if some nonzero $\varphi \in H$ would fail to be cyclic for $\pi$, then
the closure of $\pi(A)\varphi$ would be a proper, $\pi(A)$-stable subspace of $H$, so that irreducibility implies 3. The
converse is trivial, since if a proper nonzero subspace $K \subset H$ were stable for $\pi(A)$, then 3 cannot hold. □
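The criteria of Theorem C.90 can be checked by hand on $H = \mathbb{C}^2$. The sketch below compares the full matrix algebra $M_2(\mathbb{C})$ with its diagonal subalgebra, both acting on $\mathbb{C}^2$ in the defining way. The names `FULL`, `DIAG`, `is_cyclic` and `in_commutant` are illustrative, and in finite dimension cyclicity of $v$ simply means that the vectors $av$ span $\mathbb{C}^2$.

```python
# Irreducibility on H = C^2: M_2(C) versus the diagonal subalgebra.
# All names are illustrative; both algebras are given by a linear basis.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

E11, E12 = [[1, 0], [0, 0]], [[0, 1], [0, 0]]
E21, E22 = [[0, 0], [1, 0]], [[0, 0], [0, 1]]
FULL = [E11, E12, E21, E22]   # matrix units: a basis of M_2(C)
DIAG = [E11, E22]             # a basis of the diagonal matrices

def is_cyclic(basis, v, tol=1e-12):
    """v is cyclic iff the vectors a v (a in the basis) span C^2,
    i.e. iff some pair of them has a nonzero 2x2 determinant."""
    vecs = [matvec(a, v) for a in basis]
    return any(abs(p[0] * q[1] - p[1] * q[0]) > tol for p in vecs for q in vecs)

def in_commutant(e, basis):
    """e commutes with the algebra iff it commutes with a linear basis."""
    return all(matmul(e, a) == matmul(a, e) for a in basis)

full_all_cyclic = all(is_cyclic(FULL, v)
                      for v in ([1, 0], [0, 1], [1, 1j], [2, -3]))
diag_e1_cyclic = is_cyclic(DIAG, [1, 0])      # e_1 spans a stable line
diag_sum_cyclic = is_cyclic(DIAG, [1, 1])     # but (1, 1) is cyclic
proj_commutes_with_diag = in_commutant(E11, DIAG)
proj_commutes_with_full = in_commutant(E11, FULL)
```

For the diagonal algebra the projection `E11` lies in the commutant and the vector $(1,0)$ fails to be cyclic, so conditions 1 and 3 fail together, whereas for $M_2(\mathbb{C})$ the sampled nonzero vectors are all cyclic and `E11` does not commute with the algebra.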

Another useful result relates general representations to GNS-representations. We
call two representations $\pi_i : A \to B(H_i)$, $i = 1,2$, unitarily equivalent if there is a
unitary $u : H_1 \to H_2$ such that $u\pi_1(a)u^* = \pi_2(a)$ (or $u\pi_1(a) = \pi_2(a)u$) for each $a \in A$.

Proposition C.91. Let $\pi : A \to B(H)$ be a cyclic representation of $A$. If $\psi \in H$ is a
cyclic unit vector for $\pi$, then

$$\omega(a) = \langle\psi, \pi(a)\psi\rangle \tag{C.211}$$

is a state on $A$, whose GNS-representation $\pi_\omega$ is unitarily equivalent to $\pi$.

Proof. Define $u : H_\omega \to H$ first on $\pi_\omega(A)\Omega_\omega$ (which is a dense subspace of $H_\omega$) by

$$u\pi_\omega(a)\Omega_\omega = \pi(a)\psi. \tag{C.212}$$

Using (C.211) and (C.196), we then obtain

$$\|\pi_\omega(a)\Omega_\omega\|^2 = \omega(a^*a) = \langle\psi, \pi(a^*a)\psi\rangle = \|\pi(a)\psi\|^2. \tag{C.213}$$

This shows that $u$ is well defined as well as isometric, so that it extends to $H_\omega$ by
continuity. Its image is then the closure of $\pi(A)\psi$, which is $H$, since $\psi$ is cyclic by
assumption. Thus $u$ is surjective and hence unitary. Finally, we compute

$$u\pi_\omega(a)\pi_\omega(b)\Omega_\omega = \pi(a)\pi(b)\psi = \pi(a)u\pi_\omega(b)\Omega_\omega, \tag{C.214}$$

so that $u\pi_\omega(a) = \pi(a)u$ on the dense space $\pi_\omega(A)\Omega_\omega$, and thence everywhere. □

We now take up the proof of Theorem C.87, preceded by some general remarks
on direct sums of Hilbert spaces and representations. First, if $(H_1,\ldots,H_n)$ is a finite
family of Hilbert spaces, one may form the direct sum $H = H_1 \oplus \cdots \oplus H_n$, initially
merely as a vector space, and subsequently also as a space with inner product

$$\langle(\varphi_1,\ldots,\varphi_n),(\psi_1,\ldots,\psi_n)\rangle = \sum_{i=1}^n \langle\varphi_i,\psi_i\rangle_{H_i}. \tag{C.215}$$

It is easy to see that $H$ is complete in the ensuing norm

$$\|(\psi_1,\ldots,\psi_n)\|^2 = \sum_{i=1}^n \|\psi_i\|^2_{H_i}. \tag{C.216}$$

Some authors write $\psi_1 \oplus \cdots \oplus \psi_n$, or $\psi_1 + \cdots + \psi_n$, for $(\psi_1,\ldots,\psi_n)$.

Moreover, if $(\pi_i)$ is a family of representations $\pi_i : A \to B(H_i)$, then one obtains
a new representation $\bigoplus_i \pi_i$ of $A$, called the direct sum of the $\pi_i$, by

$$\Big(\bigoplus_i \pi_i(a)\Big)(\psi_1,\ldots,\psi_n) = (\pi_1(a)\psi_1,\ldots,\pi_n(a)\psi_n). \tag{C.217}$$

This construction works for arbitrary families of Hilbert spaces $(H_x)$ and
representations $(\pi_x)$, where $x \in X$ for some index set $X$. First, the elements of $H = \bigoplus_{x\in X} H_x$
are families $(\psi) \equiv (\psi_x)_{x\in X}$, where $\psi_x \in H_x$, such that

$$\|(\psi)\|^2 = \sup_F \sum_{x\in F} \|\psi_x\|^2_{H_x} < \infty, \tag{C.218}$$

where the supremum is over all finite subsets $F$ of $X$, so that the sum is defined as in
(B.11). In that case, the obvious linear operations (i.e., $((\psi) + (\varphi))_x = \psi_x + \varphi_x$ and
$(\lambda(\psi))_x = \lambda\cdot\psi_x$) are defined within $H$, since for each pair $(\varphi), (\psi) \in H$ we have,
from the triangle inequality for the norm in each finite direct sum $H_F = \bigoplus_{x\in F} H_x$,

$$\Big(\sum_{x\in F}\|\psi_x+\varphi_x\|^2_{H_x}\Big)^{1/2} \leq \Big(\sum_{x\in F}\|\psi_x\|^2_{H_x}\Big)^{1/2} + \Big(\sum_{x\in F}\|\varphi_x\|^2_{H_x}\Big)^{1/2} \leq \|(\psi)\| + \|(\varphi)\|.$$

The supremum over $F$ gives $\|(\psi) + (\varphi)\|$, which is therefore finite and satisfies the
triangle inequality for the norm. Similarly, the natural inner product in $H$ is well
defined, this time by the full Definition B.6, with $V = \mathbb{C}$ and $f(x) = \langle\varphi_x,\psi_x\rangle_{H_x}$, i.e.,

$$\langle(\varphi),(\psi)\rangle = \sum_{x\in X}\langle\varphi_x,\psi_x\rangle_{H_x}. \tag{C.219}$$

To see this, we apply Cauchy-Schwarz first in each $H_x$ and then in $\ell^2(X)$ to obtain

$$|\langle(\varphi),(\psi)\rangle| \leq \sum_{x\in X}|\langle\varphi_x,\psi_x\rangle_{H_x}| \leq \sum_{x\in X}\|\varphi_x\|\|\psi_x\| \leq \|(\varphi)\|\|(\psi)\| < \infty. \tag{C.220}$$

Finally, the proof that the direct sum Hilbert space $\bigoplus_x H_x$ is complete in the norm
(C.218) is similar to the case where $H_x = \mathbb{C}$ for each $x$, i.e., $H = \ell^2(X)$, cf. Theorem
B.9. Let $(\psi)_n$ be a Cauchy sequence in $H$, consisting of families $(\psi_x)_n = (\psi_x^{(n)})$, with $\psi_x^{(n)}$ in
each $H_x$. For each finite $F \subset X$ and $\varepsilon > 0$, we must have $\sum_{x\in F}\|\psi_x^{(n)} - \psi_x^{(m)}\|^2 < \varepsilon$ for
sufficiently large $n, m$, so that each sequence $(\psi_x^{(n)})_n$ must be Cauchy in $H_x$, with limit $\psi_x$. The
ensuing family $(\psi)$ of vectors lies in $H$ by the argument following (B.19), and the given
Cauchy sequence $(\psi)_n$ converges to $(\psi)$, again by the same proof as for $\ell^2(X)$.

If one has a family $(\pi_x)$ of representations $\pi_x : A \to B(H_x)$, their direct sum $\pi =
\bigoplus_x \pi_x$, defined by $(\pi(a)(\psi))_x = \pi_x(a)\psi_x$, is a representation of $A$ on $H$. Indeed, one
has $\|\pi(a)\| = \sup_x\{\|\pi_x(a)\|\}$, and since we have $\|\pi_x(a)\| \leq \|a\|$ for each $x$, we also
have $\|\pi(a)\| \leq \|a\|$, so that $\pi(a) \in B(H)$, and hence $\pi$ maps $A$ into $B(H)$.
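The norm identity for the direct sum representation used here can be verified along the following lines. This is a sketch of the omitted computation in the notation of (C.218), not a quotation from the text.

```latex
\begin{align*}
\|\pi(a)(\psi)\|^2
  &= \sup_F \sum_{x\in F} \|\pi_x(a)\psi_x\|^2_{H_x}
   \leq \Big(\sup_{x\in X}\|\pi_x(a)\|\Big)^2 \sup_F \sum_{x\in F}\|\psi_x\|^2_{H_x}
   = \Big(\sup_{x\in X}\|\pi_x(a)\|\Big)^2 \|(\psi)\|^2 ,
\end{align*}
% so that \|\pi(a)\| \leq \sup_x \|\pi_x(a)\|. Conversely, evaluating \pi(a) on
% families (\psi) that vanish except at a single index x gives
% \|\pi(a)\| \geq \|\pi_x(a)\| for each x, and hence equality.
```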

Our first use of such direct sums shows that cyclic representations are the building
blocks of any representation $\pi$, at least if we require $\pi$ to be nondegenerate in the
sense that for $\psi \in H$, the condition $\pi(a)\psi = 0$ for all $a \in A$ implies $\psi = 0$.

Proposition C.92. Any nondegenerate representation $\pi : A \to B(H)$ of a C*-algebra
$A$ on a Hilbert space $H$ is a direct sum of cyclic representations of $A$.

Proof. Consider families $(\psi_x)_{x\in X}$ of nonzero vectors in $H$ with the property that

$$\langle\pi(a)\psi_x, \pi(b)\psi_{x'}\rangle = 0, \tag{C.221}$$

for all $a, b \in A$ and all $x \neq x'$. Such families are partially ordered by inclusion, and
an easy application of Zorn's Lemma shows that there is a maximal such family.
For this family $(\psi_x)_{x\in X}$, we define $H_x$ as the closure of $\pi(A)\psi_x$ in $H$. Since $\pi$ is a
homomorphism, each $H_x$ is stable under $\pi(A)$, and hence the restriction $\pi_x(a)$ of $\pi(a)$
to $H_x$ defines a representation of $A$, which is cyclic by construction. By (C.221) the
subspaces $H_x$ are mutually orthogonal, and by maximality of the family (combined with
nondegeneracy of $\pi$) their closed linear span is all of $H$. It follows that
$H = \bigoplus_x H_x$ and $\pi = \bigoplus_x \pi_x$, and so the claim has been proved. □

Our second use is the proof of Theorem C.87, where we have to solve the problem
of the possible lack of injectivity of $\pi_\omega$ in our previous preliminary proof.

Proof. To do so, we replace $H_\omega$ by the crazy Hilbert space $H_c = \bigoplus_{\omega\in P(A)} H_\omega$, where
$P(A)$ is the pure state space of $A$. The Hilbert space $H_c$ carries a representation $\pi_c =
\bigoplus_{\omega\in P(A)} \pi_\omega$. The point is that if $\pi_c(a) = 0$, then $\pi_\omega(a^*a)\Omega_\omega = 0$ for each $\omega \in P(A)$,
which by (C.196) implies $\omega(a^*a) = 0$. Proposition C.15 then gives $\sigma(a^*a) = \{0\}$,
from which the spectral radius formula (C.55) gives $\|a\|^2 = \|a^*a\| = 0$, and hence $a = 0$. It
follows that $\pi_c$ is injective, and Theorem C.87 is proved. □

It should be noted that this proof relies on a shock-and-awe kind of overkill (though
nothing compared to the even crazier space $H_{cc} = \bigoplus_{\omega\in S(A)} H_\omega$, which is
traditionally used in the above proof), in that $H_c$ is far larger than necessary (indeed, in all
but the most trivial cases, $H_c$ is non-separable). For example, already for $A = M_2(\mathbb{C})$
we have $P(A) \cong S^2$, so that $H_c = \bigoplus_{\omega\in S^2} \mathbb{C}^2$; this Hilbert space is non-separable,
whereas $A$ has an injective representation on $\mathbb{C}^2$. More generally, $B_0(H)$ or $B(H)$
has an injective representation on $H$ by definition, whereas $H_c$ is non-separable. In
the commutative case, $A = C_0(X)$ yields the non-separable $H_c = \bigoplus_{x\in X} \mathbb{C}$, although
$A$ has an injective representation on the (typically) separable space $L^2(X,\mu)$.

As a nice illustration of the GNS-construction, let us treat this example in more
detail (cf. §1.5 for the simple case where $X$ is finite). If $\omega$ is some state on $C_0(X)$,
then by Theorem B.24, there is a unique probability measure $\mu$ on $X$ such that

$$\omega(f) = \int_X d\mu\, f, \quad f \in C_0(X), \tag{C.222}$$

cf. (B.39). It follows from (C.204) and (C.222) that

$$N_\omega = \Big\{f \in C_0(X) : \int_X d\mu\, |f|^2 = 0\Big\}. \tag{C.223}$$

In particular, the support of $\mu$ is $X$ iff $N_\omega = \{0\}$, in which case $A/N_\omega = C_0(X)$.

In the opposite case where $\omega$ is a pure state, i.e., $\omega = \omega_x$ for some $x \in X$, with
$\omega_x(f) = f(x)$, one has $N_\omega = \{f \in C_0(X) \mid f(x) = 0\}$, so that $A/N_\omega \cong \mathbb{C}$, under the
map $[f] \mapsto f(x)$. In general, from (C.206) - (C.207) we obtain

$$H_\omega = L^2(X,\mu); \tag{C.224}$$

$$\pi_\omega(f) = m_f; \tag{C.225}$$

$$\Omega_\omega = 1_X, \tag{C.226}$$

where $m_f\psi = f\psi$, cf. (B.238). Analogously to (B.331), we then obtain

$$\pi_\omega(C_0(X))'' = L^\infty(X,\mu). \tag{C.227}$$

The state $\omega$, initially defined on the commutative C*-algebra $C_0(X)$, then has a
normal extension to the commutative von Neumann algebra $L^\infty(X,\mu)$, cf. (C.222).
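For a finite set $X$ all of this can be made completely explicit. The sketch below is an illustration under assumptions: the three-point space, the probability vector `P` and the function names are chosen here, not taken from the text. It represents $f \in C(X)$ by its list of values, checks that two functions differing only off the support of $\mu$ define the same vector in $A/N_\omega$, and verifies $\omega(f) = \langle 1_X, m_f 1_X\rangle$.

```python
# Commutative GNS-construction on a finite set X = {0, 1, 2}.
# P is an assumed probability measure whose support is {0, 1}.

P = [0.5, 0.5, 0.0]

def omega(f):
    """State omega(f) = sum_x mu({x}) f(x), the finite version of (C.222)."""
    return sum(p * fx for p, fx in zip(P, f))

def mult(f, g):
    """Pointwise product, i.e. m_f applied to g."""
    return [fx * gx for fx, gx in zip(f, g)]

def star(f):
    """Involution: pointwise complex conjugation."""
    return [complex(fx).conjugate() for fx in f]

def inner(f, g):
    """GNS (semi-)inner product omega(f* g)."""
    return omega(mult(star(f), g))

def in_null_ideal(f, tol=1e-15):
    """Membership of N_omega, cf. (C.223)."""
    return abs(inner(f, f)) < tol

ONE = [1.0, 1.0, 1.0]
f = [1 + 1j, 2.0, 5.0]
g = [1 + 1j, 2.0, -7.0]                  # differs from f only off the support
h = [fx - gx for fx, gx in zip(f, g)]

same_class = in_null_ideal(h)            # f and g agree in A / N_omega
vector_state = inner(ONE, mult(f, ONE))  # <1_X, m_f 1_X>
dim_H = sum(1 for p in P if p > 0)       # dim L^2(X, mu) = size of the support
```

With a point mass such as `P = [1.0, 0.0, 0.0]` the same code gives `dim_H == 1`, matching the statement that $A/N_\omega \cong \mathbb{C}$ for a pure state.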

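More generally, if $A$ is an arbitrary commutative C*-algebra and $\omega$ is a state on
$A$, then, writing $\Sigma(A)$ for the Gelfand spectrum of $A$ as usual, we have

$$H_\omega = L^2(\Sigma(A),\mu); \tag{C.228}$$

$$\pi_\omega(a) = m_{\hat{a}}; \tag{C.229}$$

$$\Omega_\omega = 1_{\Sigma(A)}, \tag{C.230}$$

where $\hat{a} \in C_0(\Sigma(A))$ is the Gelfand transform of $a \in A$, and $\mu$ is the probability
measure on $\Sigma(A)$ defined by

$$\omega(a) = \int_{\Sigma(A)} d\mu\, \hat{a}. \tag{C.231}$$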

With this commutative case in mind, some authors would call a pair $(A,\omega)$, where
$A$ is a general C*-algebra and $\omega$ is a state on $A$, or, alternatively, $A$ is a general
von Neumann algebra and $\omega$ is a normal state on $A$, a non-commutative
probability space. As such, "ordinary" probability theory (at least, on locally compact
Hausdorff sample spaces) is merely the commutative case of a much more general
"non-commutative probability theory".

If $H_A$ and $H_B$ are Hilbert spaces, their algebraic tensor product $H_A \otimes H_B$ typically
fails to be a Hilbert space in the obvious way, since it is not complete (unless one
of the factors is finite-dimensional). Similarly, the algebraic tensor product $A \otimes B$ of
two C*-algebras $A$ and $B$ usually fails to be a C*-algebra. However, the second case
is far more complicated than the first: for Hilbert spaces there is a canonical norm
on the algebraic tensor product and hence a canonical completion of $H_A \otimes H_B$ into a
Hilbert space $H_A \hat\otimes H_B$. For C*-algebras, on the other hand, there is an embarrassment
of riches, in that there are many norms turning the completion $A \hat\otimes B$ of $A \otimes B$ in
some such norm into a C*-algebra. However, if $A$ or $B$ is nuclear, there is just one
possibility; see below. For example, this applies if $A$ or $B$ is finite-dimensional.

Let us first review the (algebraic) tensor product of two vector spaces $A$ and $B$.

Proposition C.93. Let $A$ and $B$ be (complex) vector spaces. There is a vector space
called $A \otimes B$, in words the algebraic tensor product of $A$ and $B$ (over $\mathbb{C}$), and a
map $p : A \times B \to A \otimes B$, such that for any vector space $C$ and any bilinear map
$\beta : A \times B \to C$, there is a unique linear map $\beta' : A \otimes B \to C$ such that $\beta = \beta' \circ p$.

In other words, the following diagram commutes:

$$\begin{array}{ccc} A \times B & \xrightarrow{\;\;p\;\;} & A \otimes B \\ & {\scriptstyle\beta}\searrow & \downarrow{\scriptstyle\beta'} \\ & & C \end{array} \tag{C.232}$$

This universal property also shows that $A \otimes B$ is unique up to isomorphism.

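Proof. In preparation for an explicit construction of $A \otimes B$, define the (complex) free
vector space on any non-empty set $X$ as $C_c(X)$, where $X$ has the discrete topology
(i.e., $C_c(X)$ consists of all functions $f : X \to \mathbb{C}$ with finite support), with pointwise
operations. For each $y \in X$, the delta-function $\delta_y \in C_c(X)$ is defined by $\delta_y(x) = \delta_{xy}$,
so that each element $f$ of $C_c(X)$ is a finite sum $f = \sum_i \lambda_i\delta_{x_i}$, where $\lambda_i \in \mathbb{C}$ and $x_i \in X$.

If $A$ and $B$ are (complex) vector spaces, $A \otimes B$ is the quotient of the free vector
space $C_c(A \times B)$ on $X = A \times B$ by the equivalence relation generated by the relations:

$$\delta_{(a_1+a_2,b)} \sim \delta_{(a_1,b)} + \delta_{(a_2,b)}; \tag{C.233}$$

$$\delta_{(a,b_1+b_2)} \sim \delta_{(a,b_1)} + \delta_{(a,b_2)}; \tag{C.234}$$

$$\lambda\delta_{(a,b)} \sim \delta_{(\lambda a,b)}; \tag{C.235}$$

$$\lambda\delta_{(a,b)} \sim \delta_{(a,\lambda b)}. \tag{C.236}$$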

For $a \in A$, $b \in B$, the image of $\delta_{(a,b)}$ in $A \otimes B$ is called $a \otimes b$, so that by construction,

$$(a_1+a_2) \otimes b = a_1 \otimes b + a_2 \otimes b; \tag{C.237}$$

$$a \otimes (b_1+b_2) = a \otimes b_1 + a \otimes b_2; \tag{C.238}$$

$$\lambda(a \otimes b) = (\lambda a) \otimes b = a \otimes (\lambda b). \tag{C.239}$$

Elements of the algebraic tensor product $A \otimes B$ may therefore be written as finite
sums $c = \sum_i a_i \otimes b_i$, with $a_i \in A$, $b_i \in B$, subject to the above relations.

Now consider some bilinear map $\beta : A \times B \to C$. We extend $\beta$ to a linear map

$$\tilde\beta : C_c(A \times B) \to C; \tag{C.240}$$

$$\tilde\beta\Big(\sum_i \lambda_i\delta_{(a_i,b_i)}\Big) = \sum_i \lambda_i\beta(a_i,b_i). \tag{C.241}$$

Since $\beta$ is bilinear, $\tilde\beta$ respects the above equivalence relation, so that it duly quotients
to $\beta' : A \otimes B \to C$, upon which the property $\beta = \beta' \circ p$, with $p(a,b) = a \otimes b$, holds by construction. Finally,
since the image of $p$ spans $A \otimes B$, the latter property uniquely determines $\beta'$. □

Equivalently, $A \otimes B$ is the quotient of formal sums $\sum_i (a_i,b_i)$ by the subspace
consisting of those sums for which $\sum_i \omega_A(a_i)\omega_B(b_i) = 0$ for all $\omega_A \in A^*$ and
$\omega_B \in B^*$. Similarly, it is useful to regard $A \otimes B$ as a subspace of the
vector space $L(A^*,B)$ of linear maps from the dual $A^*$ to $B$ through the map

$$\sum_i a_i \otimes b_i : \omega_A \mapsto \sum_i \omega_A(a_i)b_i \quad (\omega_A \in A^*); \tag{C.242}$$

this map is injective by Corollary B.45.2, since we may assume the $b_i$ to be linearly
independent. Using the canonical embedding $B \hookrightarrow B^{**}$ of Proposition B.44, this in
turn yields an injection $A \otimes B \hookrightarrow L(A^* \times B^*,\mathbb{C})$, i.e., the space of bilinear maps from
$A^* \times B^*$ to $\mathbb{C}$, given on arguments $(\omega_A,\omega_B)$ by

$$\sum_i a_i \otimes b_i : (\omega_A,\omega_B) \mapsto \sum_i \omega_A(a_i)\omega_B(b_i). \tag{C.243}$$

Proposition C.93 turns this into an injection $A \otimes B \hookrightarrow L(A^* \otimes B^*,\mathbb{C})$, given by

$$\sum_i a_i \otimes b_i : \sum_j (\omega_A)_j \otimes (\omega_B)_j \mapsto \sum_{i,j} (\omega_A)_j(a_i)(\omega_B)_j(b_i). \tag{C.244}$$
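In finite dimensions the universal property is easy to test numerically. The sketch below is illustrative only: it models $\mathbb{C}^2 \otimes \mathbb{C}^3$ as the space of $2 \times 3$ tables (the elementary tensor $a \otimes b$ being the table $a_i b_j$), and the coefficient matrix `M` defining the bilinear map is a hypothetical choice.

```python
# Universal property of the tensor product for A = C^2, B = C^3, C = C.
# The model "tensors = 2x3 tables" and the matrix M are assumptions of this sketch.

def tensor(a, b):
    """Elementary tensor a (x) b, stored as the table with entries a_i * b_j."""
    return [[ai * bj for bj in b] for ai in a]

def add(s, t):
    """Sum of two tensors (entrywise)."""
    return [[x + y for x, y in zip(rs, rt)] for rs, rt in zip(s, t)]

M = [[1, 2, 0], [-1, 0, 3]]   # hypothetical coefficients of the bilinear map

def beta(a, b):
    """Bilinear map beta(a, b) = sum_ij M_ij a_i b_j on A x B."""
    return sum(M[i][j] * a[i] * b[j] for i in range(2) for j in range(3))

def beta_prime(t):
    """The induced linear map beta' on the tensor product."""
    return sum(M[i][j] * t[i][j] for i in range(2) for j in range(3))

a1, a2, b = [1, 2], [3, -1], [2, 0, 5]
a_sum = [x + y for x, y in zip(a1, a2)]

# Relation (C.237): two different formal sums give the same tensor.
lhs_tensor = tensor(a_sum, b)
rhs_tensor = add(tensor(a1, b), tensor(a2, b))

# beta = beta' o p on elementary tensors, and beta' is linear on sums.
factorizes = beta(a1, b) == beta_prime(tensor(a1, b))
c = add(tensor(a1, b), tensor(a2, [0, 1, 1]))
value_on_sum = beta_prime(c)
```

The point of the design is that `beta_prime` only ever sees the table, not the particular pairs $(a_i, b_i)$ used to produce it, which is exactly what "factoring through $A \otimes B$" means.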

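If $A$ and $B$ are Hilbert spaces, we call them $H_A$ and $H_B$, denote their elements by
$\alpha$ and $\beta$, respectively, and attempt to define a sesquilinear form on $H_A \otimes H_B$ by

$$\Big\langle\sum_j \alpha'_j \otimes \beta'_j, \sum_i \alpha_i \otimes \beta_i\Big\rangle = \sum_{i,j} \langle\alpha'_j,\alpha_i\rangle_{H_A}\langle\beta'_j,\beta_i\rangle_{H_B}. \tag{C.245}$$

It is a non-trivial fact that this form is well defined, because representations $\sum_i \alpha_i \otimes \beta_i$
of vectors in $H_A \otimes H_B$ may not be unique. For example, if $H_A = H_B = H = \mathbb{C}^n$, and
$(\alpha_i)$ and $(\alpha'_i)$ are two orthonormal bases of $H$, then $\sum_i \alpha_i \otimes \bar\alpha_i = \sum_i \alpha'_i \otimes \bar\alpha'_i$, where the
bar denotes componentwise complex conjugation (to see this, take inner
products with an arbitrary elementary tensor $\psi \otimes \varphi$, yielding the same result).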

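To resolve this, we note that the injection $H_A \otimes H_B \hookrightarrow L(H_A^* \times H_B^*,\mathbb{C})$ just
discussed combines with the isomorphism $H^* \cong \overline{H}$ of Theorem B.66 to an injection
$H_A \otimes H_B \hookrightarrow L(\overline{H}_A \times \overline{H}_B,\mathbb{C})$, i.e., the space of bi-anti-linear maps from $H_A \times H_B$ to
$\mathbb{C}$. Proposition C.93 turns this into an injection $H_A \otimes H_B \hookrightarrow L(\overline{H}_A \otimes \overline{H}_B,\mathbb{C})$, viz.

$$\sum_i \alpha_i \otimes \beta_i : \sum_j \alpha'_j \otimes \beta'_j \mapsto \sum_{i,j} \langle\alpha'_j,\alpha_i\rangle_{H_A}\langle\beta'_j,\beta_i\rangle_{H_B}. \tag{C.246}$$

Consequently, if $\sum_i \alpha_i \otimes \beta_i = 0$, then the right-hand side of (C.245) is zero, too, since
it is the image of $\sum_j \alpha'_j \otimes \beta'_j$ under the zero map. Hence (C.245) is independent of
the choice of representatives in the sum $\sum_i \alpha_i \otimes \beta_i$, and by hermiticity of the form,
this equally well applies to the other entry $\sum_j \alpha'_j \otimes \beta'_j$.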

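It remains to show that (C.245) is an inner product, i.e., that it is positive definite.
To see this, for some given vector $\sum_i \alpha_i \otimes \beta_i$ in $H_A \otimes H_B$ one may take the linear span
$H'_A$ of all $\alpha_i$ in $H_A$, which is a (finite-dimensional) Hilbert space, and pick an orthonormal basis $(\upsilon_k)$ in $H'_A$. Absorbing
the scalars in the $\beta_i$, we may therefore write $\sum_i \alpha_i \otimes \beta_i = \sum_k \upsilon_k \otimes \beta''_k$, so that

$$\Big\langle\sum_i \alpha_i \otimes \beta_i, \sum_i \alpha_i \otimes \beta_i\Big\rangle = \sum_{k,l} \langle\upsilon_k \otimes \beta''_k, \upsilon_l \otimes \beta''_l\rangle = \sum_k \|\beta''_k\|^2 \geq 0, \tag{C.247}$$

with equality at the end iff each $\beta''_k = 0$, and hence $\sum_i \alpha_i \otimes \beta_i = 0$.

Finally, we complete $H_A \otimes H_B$ in the norm defined by the inner product (C.245);
with abuse of notation the ensuing Hilbert space is often just called $H_A \otimes H_B$, but it
would be more precise to denote it by $H_A \hat\otimes H_B$, as we will usually do.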
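The independence of the form (C.245) from the chosen representatives can be checked numerically on $\mathbb{C}^2 \otimes \mathbb{C}^2$. In the sketch below (all vectors and names are choices made for this illustration), `rep1` and `rep2` are two different lists of pairs representing the same tensor, namely $e_1 \otimes e_1 + e_2 \otimes e_2 = u_1 \otimes \bar{u}_1 + u_2 \otimes \bar{u}_2$ for the orthonormal basis $u_{1,2} = (1, \pm i)/\sqrt{2}$.

```python
# Well-definedness of the sesquilinear form (C.245) on C^2 (x) C^2.
# A finite sum sum_i alpha_i (x) beta_i is stored as a list of pairs (alpha_i, beta_i).
import math

def ip(x, y):
    """Inner product on C^2, antilinear in the first entry."""
    return sum(complex(xk).conjugate() * yk for xk, yk in zip(x, y))

def form(left, right):
    """Right-hand side of (C.245) for two lists of pairs."""
    return sum(ip(al, ar) * ip(bl, br)
               for (al, bl) in left for (ar, br) in right)

def conj(v):
    """Componentwise complex conjugation."""
    return [complex(z).conjugate() for z in v]

s = 1 / math.sqrt(2)
e1, e2 = [1, 0], [0, 1]
u1, u2 = [s, s * 1j], [s, -s * 1j]       # another orthonormal basis of C^2

rep1 = [(e1, e1), (e2, e2)]
rep2 = [(u1, conj(u1)), (u2, conj(u2))]  # a different representation of the same tensor
chi = [([1, 2j], [3, 1 - 1j]), ([0.5, -1], [1j, 2])]   # an arbitrary test vector

# ||rep1 - rep2||^2 expanded by sesquilinearity; it should vanish.
diff_norm_sq = (form(rep1, rep1) - form(rep1, rep2)
                - form(rep2, rep1) + form(rep2, rep2))
pairing1 = form(chi, rep1)
pairing2 = form(chi, rep2)
chi_norm_sq = form(chi, chi)
```

Both pairings with the test vector `chi` coincide, as the argument via (C.246) predicts, and `chi_norm_sq` is real and non-negative in accordance with (C.247).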

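It is easy to show that if $(\upsilon_i)$ and $(\upsilon'_j)$ are bases for $H_A$ and $H_B$, respectively,
then $(\upsilon_i \otimes \upsilon'_j)$ is a basis of $H_A \hat\otimes H_B$. Also, if $(X,\Sigma,\mu)$ and $(X',\Sigma',\mu')$ are
$\sigma$-finite measure spaces with $X$ and $X'$ well behaved (e.g., Polish), so that the $L^2$-spaces
are separable, one has a natural isomorphism

$$L^2(X,\Sigma,\mu) \hat\otimes L^2(X',\Sigma',\mu') \cong L^2(X \times X', \Sigma \times \Sigma', \mu \times \mu'), \tag{C.248}$$

obtained as the closure of the isometric (and hence bounded) map that sends the
vector $\sum_i \psi_i \otimes \psi'_i$ to the function $(x,x') \mapsto \sum_i \psi_i(x)\psi'_i(x')$ on $X \times X'$. Here $\Sigma \times \Sigma'$
is the smallest $\sigma$-algebra on $X \times X'$ that contains all sets $A \times A'$, $A \in \Sigma$, $A' \in \Sigma'$, and
$\mu \times \mu'$ is the familiar product measure defined on elementary measurable sets by

$$\mu \times \mu'(A \times A') = \mu(A)\mu'(A'). \tag{C.249}$$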

We now turn to tensor products of C*-algebras. If $A$ and $B$ are C*-algebras, then
the algebraic tensor product $A \otimes B$ of $A$ and $B$ (just seen as vector spaces) is endowed
with a natural multiplication and involution, given by linear extension of

$$(a_1 \otimes b_1) \cdot (a_2 \otimes b_2) = (a_1a_2) \otimes (b_1b_2); \tag{C.250}$$

$$(a \otimes b)^* = a^* \otimes b^*, \tag{C.251}$$

respectively. Thus $A \otimes B$ is a *-algebra, and Proposition C.93 specializes to:

Proposition C.94. If $C$ is a *-algebra and if a bilinear map $\beta : A \times B \to C$ satisfies

$$\beta(a_1a_2, b_1b_2) = \beta(a_1,b_1)\beta(a_2,b_2); \quad \beta(a^*,b^*) = \beta(a,b)^*, \tag{C.252}$$

then $\beta$ factors through $A \otimes B$ (now seen as a *-algebra), as in (C.232).



The proof is similar. In order to turn $A \otimes B$ (seen as a *-algebra) into a C*-algebra,
we need a C*-norm, i.e., a norm on $A \otimes B$ satisfying the C*-axioms (C.1) - (C.2).
If such a norm exists, we denote the completion of $A \otimes B$ in that particular norm by
$A \hat\otimes B$, where typically $\|\cdot\|$ and hence $\hat\otimes$ carry some label. This completion $A \hat\otimes B$ is
a C*-algebra in the obvious way. There will be no shortage of such norms!

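For example, suppose $A \subset B(H_A)$ and $B \subset B(H_B)$. For each $a \in A$, we form the
operator $a \otimes 1_B$ on $H_A \otimes H_B$ (where $1_B$ is the unit of $B(H_B)$, which is also the unit of
$B$ if it has one). As in (C.247), but with the roles of the two factors interchanged, we may assume
that generic elements of $H_A \otimes H_B$ take the form $\sum_k \alpha_k \otimes \upsilon_k$, with the $\upsilon_k$ orthonormal in $H_B$
and $\alpha_k \in H_A$. We then estimate

$$\Big\|(a \otimes 1_B)\sum_k \alpha_k \otimes \upsilon_k\Big\|^2 = \Big\|\sum_k (a\alpha_k) \otimes \upsilon_k\Big\|^2 = \sum_k \|a\alpha_k\|^2 \leq \|a\|^2\sum_k \|\alpha_k\|^2 = \|a\|^2\Big\|\sum_k \alpha_k \otimes \upsilon_k\Big\|^2. \tag{C.253}$$

Hence $a \otimes 1_B$ is bounded on the pre-Hilbert space $H_A \otimes H_B$, and extends to a bounded
operator on $H_A \hat\otimes H_B$ by continuity; this extension is usually called $a \otimes 1_B$, too.
Similarly, any $b \in B$ defines a bounded operator $1_A \otimes b$ on $H_A \hat\otimes H_B$, and since

$$a \otimes b = (a \otimes 1_B) \cdot (1_A \otimes b), \tag{C.254}$$

all elements $\sum_i a_i \otimes b_i$ of $A \otimes B$ extend to elements of $B(H_A \hat\otimes H_B)$. Now define

$$\Big\|\sum_i a_i \otimes b_i\Big\|_{\min} = \Big\|\sum_i a_i \otimes b_i\Big\|_{B(H_A \hat\otimes H_B)}. \tag{C.255}$$

This is clearly a C*-norm on $A \otimes B$. Moreover, it is a cross-norm, in that

$$\|a \otimes b\|_{\min} = \|a\|\|b\|. \tag{C.256}$$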

This construction generalizes to any two C*-algebras, since by Theorem C.87 we
have injective representations $\pi_A : A \to B(H_A)$ and $\pi_B : B \to B(H_B)$ of $A$ and $B$,
respectively, and it is easy to verify that the norm $\|\cdot\|_{\min}$ on $A \otimes B$ and ensuing
completion $A \hat\otimes_{\min} B$ are independent of the chosen representation. Furthermore,

$$\|c\|_{\min} = \sup\{\|\pi_A \otimes \pi_B(c)\|_{B(H_A \hat\otimes H_B)}\}, \tag{C.257}$$

where $\pi_A$ and $\pi_B$ run through all representations of $A$ and $B$, respectively. The
ensuing completion $A \hat\otimes_{\min} B$ is called the injective tensor product of $A$ and $B$. Without
proof (which requires more advanced methods than the elementary arguments we
use in this section), we mention that, as its name suggests, $\|\cdot\|_{\min}$ is the smallest
C*-norm on $A \otimes B$. This has a very important consequence:

Proposition C.95. Any C*-norm $\|\cdot\|$ on $A \otimes B$ satisfies $\|a \otimes b\| = \|a\|\|b\|$.

In other words, any C*-norm $\|\cdot\|$ on $A \otimes B$ is a cross-norm. To prove this from the
minimality of the spatial norm, we need a lemma of wider interest.

Lemma C.96. If $\|\cdot\|$ is any C*-norm on $A \otimes B$, then for all $a \in A$ and $b \in B$,

$$\|a \otimes b\| \leq \|a\|\|b\|. \tag{C.258}$$

Consequently, for any C*-norm on $A \otimes B$ and any $c \in A \otimes B$, we have the bound

$$\|c\| \leq \inf\Big\{\sum_i \|a_i\|\|b_i\| : c = \sum_i a_i \otimes b_i\Big\}. \tag{C.259}$$

Proof. In any C*-algebra $A$, if $a \geq 0$, we have $\|a\| \leq 1$ iff $a^2 \leq a$. This is trivial for
$A = C(X)$, and in general can be proved within $C^*(a) \subset A$, since $C^*(a) \cong C(\sigma(a))$.
Now take $a \in A$ and $b \in B$ such that $a \geq 0$, $b \geq 0$, $\|a\| \leq 1$, and $\|b\| \leq 1$, so that
$(a \otimes b)^2 = a^2 \otimes b^2 \leq a \otimes b^2 \leq a \otimes b$, and hence $\|a \otimes b\| \leq 1$. For general $a \geq 0$, $b \geq 0$,
rescaling to $a/\|a\|$ etc. gives (C.258). For general $a, b$ altogether, we compute:

$$\|a \otimes b\|^2 = \|(a \otimes b)^*(a \otimes b)\| = \|a^*a \otimes b^*b\| \leq \|a^*a\|\|b^*b\| = \|a\|^2\|b\|^2. \tag{C.260}$$

Eq. (C.259) then follows from the triangle inequality on the norm. □
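For matrix algebras the spatial norm is just the operator norm of the Kronecker product, so both the cross-norm property and the bound (C.259) can be tested numerically. The following sketch makes several assumptions of its own: the sample matrices are arbitrary, and the operator norm is approximated by power iteration on $c^*c$, a numerical stand-in that is not part of the text (its estimate never exceeds the true norm, so the test of (C.259) is safe even without full convergence).

```python
# Cross-norm property and the bound (C.259) for the spatial norm on M_2 (x) M_2.
# Sample matrices and the power-iteration norm are assumptions of this sketch.
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product, i.e. the matrix of a (x) b on C^2 (x) C^2."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def op_norm(a, iters=200):
    """Operator norm via power iteration on a* a, started from (1, ..., 1)."""
    h = matmul(adjoint(a), a)
    v = [[1.0] for _ in range(len(h))]
    lam = 0.0
    for _ in range(iters):
        w = matmul(h, v)
        lam = math.sqrt(sum(abs(x[0]) ** 2 for x in w))
        if lam == 0.0:
            return 0.0
        v = [[x[0] / lam] for x in w]    # renormalize to a unit vector
    return math.sqrt(lam)                # lam approximates the top eigenvalue of a* a

a1, b1 = [[1, 2], [0, 1]], [[2, 0], [1, 1]]
a2, b2 = [[0, 1], [1, 0]], [[1, 0], [0, -1]]

cross_lhs = op_norm(kron(a1, b1))               # ||a1 (x) b1||
cross_rhs = op_norm(a1) * op_norm(b1)           # ||a1|| ||b1||

c = [[x + y for x, y in zip(r, s)]
     for r, s in zip(kron(a1, b1), kron(a2, b2))]   # c = a1 (x) b1 + a2 (x) b2
norm_c = op_norm(c)
bound = op_norm(a1) * op_norm(b1) + op_norm(a2) * op_norm(b2)
```

Since all C*-norms on $M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ coincide, this single computation illustrates the statement for every C*-norm in this case.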

If $A$ and $B$ each have a unit, there is a simpler proof: as in (C.254), we have

$$\|a \otimes b\| = \|(a \otimes 1_B)(1_A \otimes b)\| \leq \|a \otimes 1_B\|\|1_A \otimes b\| = \|a\|\|b\|, \tag{C.261}$$

where we used $\|a \otimes 1_B\| = \|a\|$ etc., which is the case because the map $a \mapsto a \otimes 1_B$
from $A$ to $A \hat\otimes B$ is an injective homomorphism and hence is an (isometric) isomorphism onto its image.
We now prove Proposition C.95.

Proof. For any C*-norm $\|\cdot\|$, we have $\|a \otimes b\| \geq \|a \otimes b\|_{\min} = \|a\|\|b\|$, since the
spatial norm is itself a cross-norm, cf. (C.256). Then (C.258) gives equality. □

In view of (C.259) and the existence of at least one C*-norm on $A \otimes B$ (namely
the spatial one), it makes sense to define the maximal C*-norm on $A \otimes B$ by

$$\Big\|\sum_i a_i \otimes b_i\Big\|_{\max} = \sup\Big\{\Big\|\sum_i a_i \otimes b_i\Big\| : \|\cdot\| \text{ is a C*-norm on } A \otimes B\Big\}. \tag{C.262}$$

This is clearly a C*-norm, and hence it is also a cross-norm, i.e.,

$$\|a \otimes b\|_{\max} = \|a\|\|b\|. \tag{C.263}$$

This property may be proved without using the deep result that the spatial norm is
the minimal one (which in turn led to Proposition C.95); all we need is the inequality

$$\|c\|_{\min} \leq \|c\|_{\max}, \tag{C.264}$$

for any $c \in A \otimes B$, which follows from the definition of $\|\cdot\|_{\max}$, upon which (C.263)
may be proved in the same way as for general C*-norms. The completion $A \hat\otimes_{\max} B$
of $A \otimes B$ in the norm $\|\cdot\|_{\max}$ is called the projective tensor product of $A$ and $B$.

If we define representations of the pre-C*-algebra $A \otimes B$ on Hilbert spaces in the
same way as for C*-algebras, i.e., as linear maps $\pi : A \otimes B \to B(H)$ that preserve
the product (C.250) and the involution (C.251), we obtain

$$\|c\|_{\max} = \sup\{\|\pi(c)\|\}, \tag{C.265}$$

where $c = \sum_i a_i \otimes b_i \in A \otimes B$, and $\pi$ runs through all representations of $A \otimes B$. Indeed,
according to Theorem C.87 there exists an injective representation $\pi$ of $A \hat\otimes_{\max} B$,
so that $\|c\|_{\max} = \|\pi(c)\|$ for each $c \in A \hat\otimes_{\max} B$, and hence also for each $c \in A \otimes B$.
Furthermore, for any representation $\pi$ of $A \otimes B$ the expression $\max\{\|\pi(c)\|, \|c\|_{\min}\}$
defines a C*-norm on $A \otimes B$, so that $\|\pi(c)\| \leq \|c\|_{\max}$, and (C.265)
follows. This also shows that the supremum in (C.265) is actually attained.

In what follows, we restrict ourselves to the case that $A$ and $B$ have a unit, which
suffices for our applications, but the claim is true in general (with a slightly more
complicated proof, involving either approximate units or unitizations). If $A$ and $B$
each have a unit, so does $A \otimes B$, viz. $1_A \otimes 1_B$. States $\omega$ on $A \otimes B$ are then defined
as for unital C*-algebras, i.e., as positive linear functionals (in the usual sense that
$\omega(c^*c) \geq 0$ for any $c \in A \otimes B$) that map the unit $1_A \otimes 1_B$ of $A \otimes B$ to 1.

Proposition C.97. Let $A$ and $B$ be unital. Then each state on $A \otimes B$ is continuous
with respect to the $\|\cdot\|_{\max}$-norm, and hence extends to a state on the maximal tensor
product $A \hat\otimes_{\max} B$. Thus identifying states on $A \otimes B$ and on $A \hat\otimes_{\max} B$, we have

$$S(A \otimes B) = S(A \hat\otimes_{\max} B). \tag{C.266}$$

Proof. Let $\omega : A \otimes B \to \mathbb{C}$ be a state. Although $A \otimes B$ may not be a C*-algebra, the
GNS-construction of Theorem C.88 goes through as if it were. The reason is that the
only delicate point, namely boundedness of $\pi_\omega(a \otimes b)$, may be proved from (C.94),
just as in the usual case. Indeed, for $a \in A$, $b \in B$, and $c \in A \otimes B$, we estimate

$$\|\pi_\omega(a \otimes b)c_\omega\|^2 = \omega(c^*(a \otimes b)^*(a \otimes b)c) = \omega(c^*(a^*a \otimes b^*b)c) \leq \|a\|^2\|b\|^2\omega(c^*c) = \|a \otimes b\|^2_{\max}\|c_\omega\|^2,$$

so that $\|\pi_\omega(a \otimes b)\| \leq \|a \otimes b\|_{\max}$, and hence $\pi_\omega(a \otimes b)$ may be extended to the
completion $H_\omega$ of $(A \otimes B)/N_\omega$ by continuity. Here we used the facts that:

• $(a \otimes b)^*(a \otimes b) = a^*a \otimes b^*b$, so that the right-hand side is positive in $A \otimes B$.

• $0 \leq a^*a \leq \|a\|^2 1_A$ and $0 \leq b^*b \leq \|b\|^2 1_B$, as $A$ and $B$ are C*-algebras, cf. (C.83).

• If $c' \geq 0$ in $A \otimes B$, then $c^*c'c \geq 0$, as for C*-algebras, see the argument preceding
(C.93). The argument is the same: $c^*c'c = c^*d^*dc = (dc)^*dc \geq 0$.
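The way these three facts combine into the estimate can be spelled out as follows. This is a sketch of the intermediate step, in which positivity in $A \otimes B$ is understood as being a finite sum of elements of the form $d^*d$.

```latex
\begin{align*}
\|a\|^2\|b\|^2\, 1_A \otimes 1_B - a^*a \otimes b^*b
  &= \big(\|a\|^2 1_A - a^*a\big) \otimes \|b\|^2 1_B
   + a^*a \otimes \big(\|b\|^2 1_B - b^*b\big) .
\end{align*}
% Each summand has the form x^*x \otimes y^*y = (x \otimes y)^*(x \otimes y), with x and y
% positive square roots taken in A and B, so the left-hand side is positive in
% A \otimes B. Conjugating with c and applying the positive functional \omega gives
% \omega(c^*(a^*a \otimes b^*b)c) \leq \|a\|^2\|b\|^2\,\omega(c^*c).
```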

Writing $\Omega_\omega = (1_A \otimes 1_B)_\omega$ for the cyclic vector of $H_\omega$, as in (C.208), for any element
$c \in A \otimes B$ we obtain, using (C.265) in the final inequality, the decisive bound

$$|\omega(c)| = |\langle\Omega_\omega, \pi_\omega(c)\Omega_\omega\rangle| \leq \|\pi_\omega(c)\| \leq \|c\|_{\max}. \tag{C.267}$$

In other words, $\omega$ is continuous with respect to the $\|\cdot\|_{\max}$-norm, and since $A \otimes B$
is dense in $A \hat\otimes_{\max} B$, the state extends to the completed tensor product by continuity.
It follows from (C.267) and $\omega(1_A \otimes 1_B) = 1$ that $\|\omega\| = 1$ as a functional on $A \otimes B$
equipped with the $\|\cdot\|_{\max}$-norm, so that the extension in question has the same
norm, and hence by Proposition C.5 is a state on $A \hat\otimes_{\max} B$. Conversely, a state on
$A \hat\otimes_{\max} B$ restricts to a state on $A \otimes B$, since the two *-algebras have the same unit
and (trivially) if $c$ is positive in the latter, then it is so in the former. □

The above proposition concerns extensions of arbitrary states on $A \otimes B$. However,
product states on $A \otimes B$ can be extended to any completed tensor product $A \hat\otimes B$.

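Proposition C.98. If $\omega_A$ and $\omega_B$ are states on $A$ and $B$, respectively, then the
corresponding product state $\omega_A \otimes \omega_B$ on $A \otimes B$, defined as in (C.243) by

$$\omega_A \otimes \omega_B\Big(\sum_i a_i \otimes b_i\Big) = \sum_i \omega_A(a_i)\omega_B(b_i), \tag{C.268}$$

is continuous with respect to any cross-norm $\|\cdot\|$, and hence extends to $A \hat\otimes B$.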

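Proof. Since the spatial norm is minimal among all cross-norms, it is enough to
prove continuity with respect to $\|\cdot\|_{\min}$. As in the proof of Proposition C.97, we
form the GNS-representation induced by $\omega_A \otimes \omega_B$, so that for any $c \in A \otimes B$,

$$(\omega_A \otimes \omega_B)(c) = \langle\Omega_{\omega_A \otimes \omega_B}, \pi_{\omega_A \otimes \omega_B}(c)\Omega_{\omega_A \otimes \omega_B}\rangle. \tag{C.269}$$

Now consider the representation $\pi_{\omega_A}(A) \otimes \pi_{\omega_B}(B)$ on $H_{\omega_A} \hat\otimes H_{\omega_B}$ with cyclic vector
$\Omega_{\omega_A} \otimes \Omega_{\omega_B}$. Writing $c = \sum_i a_i \otimes b_i$ as usual, a simple computation gives

$$\langle\Omega_{\omega_A} \otimes \Omega_{\omega_B}, \pi_{\omega_A} \otimes \pi_{\omega_B}(c)\,\Omega_{\omega_A} \otimes \Omega_{\omega_B}\rangle = \sum_i \langle\Omega_{\omega_A}, \pi_{\omega_A}(a_i)\Omega_{\omega_A}\rangle\langle\Omega_{\omega_B}, \pi_{\omega_B}(b_i)\Omega_{\omega_B}\rangle = (\omega_A \otimes \omega_B)(c). \tag{C.270}$$

Using the same reasoning as in (the proof of) Proposition C.91 (which does not
apply literally, since it is about C*-algebras), it follows from (C.270) that $\pi_{\omega_A \otimes \omega_B}(A \otimes B)$
is unitarily equivalent to $\pi_{\omega_A}(A) \otimes \pi_{\omega_B}(B)$, so that, using (C.270), analogously to
(C.267) but this time using (C.257) at the end, we have

$$|(\omega_A \otimes \omega_B)(c)| \leq \|\pi_{\omega_A} \otimes \pi_{\omega_B}(c)\| \leq \|c\|_{\min}. \quad □$$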
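On matrix algebras, a product state can be written with density matrices as $(\omega_A \otimes \omega_B)(c) = \mathrm{Tr}((\rho_A \otimes \rho_B)c)$, and the defining formula (C.268) then becomes a trace identity that is easy to test. The density matrices, sample elements and helper names in the sketch below are assumptions made for illustration.

```python
# Product state on M_2(C) (x) M_2(C), realized through density matrices.
# rho_A, rho_B and the sample elements are assumed example data.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: the matrix of a (x) b on C^2 (x) C^2."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def add(s, t):
    return [[x + y for x, y in zip(rs, rt)] for rs, rt in zip(s, t)]

RHO_A = [[0.75, 0.0], [0.0, 0.25]]
RHO_B = [[0.5, 0.5], [0.5, 0.5]]     # projection onto (1, 1)/sqrt(2): a pure state
ID = [[1, 0], [0, 1]]

def omega_A(a):
    return trace(matmul(RHO_A, a))

def omega_B(b):
    return trace(matmul(RHO_B, b))

def omega_AB(c):
    """Candidate product state Tr((rho_A (x) rho_B) c) on 4x4 matrices."""
    return trace(matmul(kron(RHO_A, RHO_B), c))

# c = a_1 (x) b_1 + a_2 (x) b_2
pairs = [([[1, 2], [3, 4]], [[0, 1], [1, 0]]),
         ([[0, 1j], [-1j, 0]], [[2, 0], [0, -1]])]
c = add(kron(*pairs[0]), kron(*pairs[1]))

lhs = omega_AB(c)                                     # left-hand side of (C.268)
rhs = sum(omega_A(a) * omega_B(b) for a, b in pairs)  # right-hand side of (C.268)
unit_value = omega_AB(kron(ID, ID))                   # normalization on 1_A (x) 1_B
```

The agreement of `lhs` and `rhs` does not depend on how `c` is decomposed into elementary tensors, which is the well-definedness implicit in (C.268).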

As an application, analogously to (C.248), we show that:

Proposition C.99. For any locally compact Hausdorff spaces $X, Y$ and any cross-norm
on $C_0(X) \otimes C_0(Y)$, with completed tensor product $C_0(X) \hat\otimes C_0(Y)$, we have

$$C_0(X) \hat\otimes C_0(Y) \cong C_0(X \times Y), \tag{C.271}$$

under the isomorphism given by continuous extension of the map $f \otimes g \mapsto fg :
(x,y) \mapsto f(x)g(y)$ from the algebraic tensor product $C_0(X) \otimes C_0(Y)$ to $C_0(X \times Y)$.

Proof. We just prove the unital case, where $X$ and $Y$ are compact.

Let $x \in X$ and $y \in Y$, and take the corresponding evaluation maps $\mathrm{ev}_x$ and $\mathrm{ev}_y$
on $C(X)$ and $C(Y)$, respectively. These are multiplicative states, cf. Proposition
C.19. Then $\mathrm{ev}_x \otimes \mathrm{ev}_y$ is a nonzero multiplicative state on $C(X) \otimes C(Y)$, and hence
also on $C(X) \hat\otimes C(Y)$, cf. Proposition C.98. This gives an injection of $X \times Y$ into
$\Sigma(C(X) \hat\otimes C(Y))$, i.e., the Gelfand spectrum of $C(X) \hat\otimes C(Y)$, cf. §C.2.

Conversely, the restriction $\omega_1$ of any $\omega \in \Sigma(C(X) \hat\otimes C(Y))$ to $C(X)$, given by
$\omega_1(f) = \omega(f \otimes 1_Y)$, is multiplicative, as is the restriction $\omega_2$ of $\omega$ to $C(Y)$,
defined by $\omega_2(g) = \omega(1_X \otimes g)$. Then $\omega = \omega_1 \otimes \omega_2$, with ensuing injective map
$\Sigma(C(X) \hat\otimes C(Y)) \to X \times Y$. Thus the above injection is also a surjection, and hence
a bijection, which is easily seen to be a homeomorphism. □

This can also be proved without Proposition C.98, using only the second step; if
$\Sigma(C(X) \hat\otimes C(Y)) \neq X \times Y$, then, since $\Sigma(C(X) \hat\otimes C(Y))$ is closed in $X \times Y$, there are
nonempty opens $U \subset X$ and $V \subset Y$ such that $(U \times V) \cap \Sigma(C(X) \hat\otimes C(Y)) = \emptyset$. Now
take nonzero functions $f \in C_c(U)$ and $g \in C_c(V)$, so that $\omega(f \otimes g) = 0$ for all
$\omega \in \Sigma(C(X) \hat\otimes C(Y))$, although $f \otimes g \neq 0$. This contradicts the isometry (C.18) of the Gelfand transform.
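For finite sets no completion is needed, and the isomorphism (C.271) reduces to the statement that every function on $X \times Y$ is a finite sum of products $f(x)g(y)$. The sketch below uses a three-point $X$, a two-point $Y$ and an arbitrary table `F` (all assumed example data); it decomposes `F` as $\sum_x \delta_x \otimes F(x,\cdot)$ and checks that evaluation at a point of $X \times Y$ is multiplicative with respect to the product (C.250).

```python
# C(X) (x) C(Y) = C(X x Y) for finite X = {0, 1, 2} and Y = {0, 1}.
# The table F and the helper names are illustrative assumptions.

X, Y = range(3), range(2)
F = [[1.0, -2.0], [0.5, 4.0], [3.0, 0.0]]    # F[x][y], an element of C(X x Y)

def delta(x0):
    """Indicator function of the point x0 in C(X)."""
    return [1.0 if x == x0 else 0.0 for x in X]

# F = sum_x delta_x (x) F(x, .): a finite sum of elementary tensors (f_i, g_i).
terms = [(delta(x0), list(F[x0])) for x0 in X]

def evaluate(ts, x, y):
    """Value at (x, y) of the image of sum_i f_i (x) g_i in C(X x Y)."""
    return sum(f[x] * g[y] for f, g in ts)

def product_terms(s, t):
    """Product in the algebraic tensor product, by bilinear extension of (C.250)."""
    return [([p * q for p, q in zip(f1, f2)], [p * q for p, q in zip(g1, g2)])
            for f1, g1 in s for f2, g2 in t]

reconstructed = [[evaluate(terms, x, y) for y in Y] for x in X]
ev_multiplicative = all(
    abs(evaluate(product_terms(terms, terms), x, y) - evaluate(terms, x, y) ** 2) < 1e-12
    for x in X for y in Y)
```

Here `evaluate(..., x, y)` plays the role of the character $\mathrm{ev}_x \otimes \mathrm{ev}_y$, and the six points of $X \times Y$ exhaust the Gelfand spectrum in this finite case.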

Proposition C.100. For any locally compact Hausdorff space $X$ and any C*-algebra
$B$, let $C_0(X,B)$ be the C*-algebra of all continuous functions $f : X \to B$ for which
the function $x \mapsto \|f(x)\|_B$ is in $C_0(X)$, equipped with the supremum norm

$$\|f\|_\infty = \sup\{\|f(x)\|_B, x \in X\}. \tag{C.272}$$

For any C*-norm with ensuing tensor product $\hat\otimes$, one then has

$$C_0(X) \hat\otimes B \cong C_0(X,B), \tag{C.273}$$

under continuous extension of the map from $C_0(X) \otimes B$ to $C_0(X,B)$ defined by

$$f \otimes b \mapsto (fb : x \mapsto f(x)b). \tag{C.274}$$

We just prove this for the minimal (i.e. spatial) C*-norm; the general case follows
from nuclearity of $C_0(X)$, cf. Proposition C.101 below.

Proof Take some injective representation tCb ■ B ^ B{Hb), and represent Cq{X,B) 
on lf{X)®F[B by linear extension of % : Cq{X,B) -g B{f'{X)®F[B), as defined by 

n{f)5,(g)<p = 5,® (C.275) 

where / G Co{X,B), xGX, and cp G Hb', this operator is easily seen to be bounded. 
In particular, an element fb GCo{X,B), as in (C.274), is represented by 

n{fb){5,®f)=f{x)5,®nB{b)9- (C.276) 

Denoting the representation of Co(X) on f'fX) through multiplication operators by 
i-C-, ^m(f)v(x) = f{x)\l/{x), where f gCo{X) and y/ S £^{X), we then have 

%m®'r^B{f®b) = %{fb). (C.277) 
In this way, $C_0(X)\hat\otimes_{\min}B$ is faithfully represented as a subalgebra of

$$\pi(C_0(X,B)) \cong C_0(X,B), \tag{C.278}$$

and so the final step is merely to show that $C_0(X)\otimes B$ is dense in $C_0(X,B)$. Indeed,
taking $X$ compact for simplicity (otherwise one needs a further approximation
argument), for given $f\in C(X,B)$ and $\varepsilon > 0$, define a cover $\mathcal{U} = (U_x)_{x\in X}$ of $X$ by

$$U_x = \{y\in X \mid \|f(x)-f(y)\| < \varepsilon\}. \tag{C.279}$$

Since $X$ is compact, $\mathcal{U}$ has a finite subcover $\{U_{x_1},\ldots,U_{x_n}\}$, with associated
partition of unity $(g_{x_i})$, i.e., one has $g_{x_i}\in C_c(U_{x_i})$, with $0\le g_{x_i}\le 1$, and

$$\sum_{i=1}^n g_{x_i}(x) = 1 \quad (x\in X). \tag{C.280}$$

Define an approximant $g\in C(X)\otimes B$ by

$$g = \sum_i g_{x_i}\otimes f(x_i), \tag{C.281}$$

whose image $g\in C(X,B)$ is given by $g(x) = \sum_i g_{x_i}(x)f(x_i)$. Then for each $x\in X$,

$$\|g(x)-f(x)\|_B = \Big\|\sum_i g_{x_i}(x)\big(f(x_i)-f(x)\big)\Big\|_B < \sum_i g_{x_i}(x)\cdot\varepsilon = \varepsilon, \tag{C.282}$$

so that, taking $\sup_x$, we have $\|g-f\|\le\varepsilon$. This proves the claim. □

Since $C_0(X\times Y)\cong C_0(X,C_0(Y))$ under the map $f\mapsto\tilde f$ with $f(x,y) = (\tilde f(x))(y)$, the
isomorphism (C.271) is a special case of (C.273).

Another case where the choice of a cross-norm does not matter (this time because
no completion is even needed) is the following. Recall Corollary C.28.

Proposition C.101. Let $A$ be a finite-dimensional C*-algebra. Then for any
C*-algebra $B$, $A\otimes B$ is complete in any C*-norm, and hence all C*-norms coincide.

Thus $A\hat\otimes B = A\otimes B$, though one still needs a norm on $A\otimes B$ to make it a C*-algebra!

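Proof. In view of Theorem C.163, we only need to prove this for $A = M_n(\mathbb{C})$, $n\in\mathbb{N}$.
As in the previous proof, we use the spatial tensor product on $M_n(\mathbb{C})\otimes B$, so let us
faithfully represent $M_n(\mathbb{C})$ and $B$ on $\mathbb{C}^n$ and $H_B$, respectively, and form the Hilbert
space $\mathbb{C}^n\hat\otimes H_B = \mathbb{C}^n\otimes H_B$, carrying the representation $\mathrm{id}\otimes\pi_B$ of $M_n(\mathbb{C})\otimes B$, and
hence of the (alleged) completion $M_n(\mathbb{C})\hat\otimes_{\min}B$. Let

$$c = \sum_{i,j=1}^n e_{ij}\otimes b^{ij} \in M_n(\mathbb{C})\otimes B, \tag{C.283}$$

where $(e_{ij})$ is the standard basis of $M_n(\mathbb{C})$ and $b^{ij}\in B$. For any such $c$, we have
(identifying $B$ with its faithful image $\pi_B(B)$)

$$\|c\|_{\min}^2 \ge \Big\|\sum_{i,j=1}^n (e_{ij}\otimes b^{ij})(\upsilon_k\otimes\varphi)\Big\|^2_{\mathbb{C}^n\otimes H_B} = \sum_{i=1}^n \|b^{ik}\varphi\|^2_{H_B}, \tag{C.284}$$

where $(\upsilon_1,\ldots,\upsilon_n)$ is the standard basis of $\mathbb{C}^n$, $k = 1,\ldots,n$ is fixed, and $\varphi\in H_B$ is a
unit vector. Taking the supremum over $\varphi$ gives

$$\|c\|_{\min} \ge \|b^{ij}\|_B, \tag{C.285}$$

for each fixed pair $(i,j)$. Hence any Cauchy sequence $(c_k)$ in $M_n(\mathbb{C})\otimes B$ takes the
form $c_k = \sum_{i,j=1}^n e_{ij}\otimes b_k^{ij}$, where each $(b_k^{ij})$ is a Cauchy sequence in $B$ for fixed $(i,j)$.
Then, using the fact that $\|e_{ij}\| = 1$, we have

$$\|c_k - c\|_{\min} = \Big\|\sum_{i,j=1}^n e_{ij}\otimes(b_k^{ij}-b^{ij})\Big\|_{\min} \le \sum_{i,j=1}^n \|b_k^{ij}-b^{ij}\|_B, \tag{C.286}$$

for any $c\in M_n(\mathbb{C})\otimes B$, as in (C.283). Taking $c$ such that $b^{ij} = \lim_k b_k^{ij}$, it follows
that $c_k\to c$ in $\|\cdot\|_{\min}$, i.e., in $M_n(\mathbb{C})\hat\otimes_{\min}B$. In particular, the limit $c$ of any Cauchy
sequence in $M_n(\mathbb{C})\otimes B$ with respect to the norm $\|\cdot\|_{\min}$ lies in $M_n(\mathbb{C})\otimes B$, which
is therefore complete already and is a C*-algebra in the spatial norm. Since the
norm in a C*-algebra is unique (cf. Corollary C.28), it follows that any C*-norm on
$M_n(\mathbb{C})\otimes B$ must coincide with the spatial one $\|\cdot\|_{\min}$. □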

It is also easy to show that

$$M_n(\mathbb{C})\otimes B \cong M_n(B), \tag{C.287}$$

i.e., the $n\times n$-matrices with entries in $B$, with obvious operations and norm given by
faithfully representing $B$ on some Hilbert space $H_B$, as above, and then letting $M_n(B)$
act on $H_B^n = H_B\oplus\cdots\oplus H_B$ (i.e., $n$ copies) in the natural way. A specific isomorphism
$M_n(\mathbb{C})\otimes B\to M_n(B)$ is then given by sending $\sum_{i,j=1}^n e_{ij}\otimes b^{ij}$ to the matrix $(b^{ij})$.
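
For a concrete feel for the isomorphism (C.287), the following minimal sketch takes $B = M_2(\mathbb{C})$ and checks that $\sum_{i,j} e_{ij}\otimes b^{ij}$, computed with Kronecker products, is exactly the block matrix $(b^{ij})$. The helper names (`kron`, `matrix_unit`, `block_matrix`) and the sample entries are illustrative assumptions, not notation from the text.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def matrix_unit(i, j, n):
    """The n x n matrix unit e_ij: 1 in row i, column j, and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_matrix(blocks):
    """Assemble an n x n array of m x m blocks into one (n*m) x (n*m) matrix."""
    n, m = len(blocks), len(blocks[0][0])
    return [
        [blocks[i][j][k][l] for j in range(n) for l in range(m)]
        for i in range(n)
        for k in range(m)
    ]


n, m = 2, 2
# Hypothetical entries b^{ij} in B = M_2(C), chosen only for illustration.
b = [
    [[[1, 2], [3, 4]], [[0, 1j], [-1j, 0]]],
    [[[5, 0], [0, 5]], [[2, -1], [7, 3]]],
]

# c = sum_{i,j} e_ij (x) b^{ij}, as in (C.283).
c = [[0] * (n * m) for _ in range(n * m)]
for i in range(n):
    for j in range(n):
        c = add(c, kron(matrix_unit(i, j, n), b[i][j]))

matches_block_matrix = c == block_matrix(b)
```

The ordering convention of `kron` (first factor indexes the blocks) is what makes the elementary tensor $e_{ij}\otimes b$ land in block position $(i,j)$; with the opposite convention one obtains a permuted but unitarily equivalent picture.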

Finally, one of the highlights of the theory of tensor products on $A\otimes B$ is a concept
that apparently makes the entire theory superfluous:

Definition C.102. A C*-algebra $A$ is called nuclear if for any C*-algebra $B$, the
norms $\|\cdot\|_{\min}$ and $\|\cdot\|_{\max}$ (and consequently all C*-norms) on $A\otimes B$ coincide.

The class of nuclear C*-algebras is large but not exhaustive: if $H$ is
infinite-dimensional, then $B_0(H)$ is nuclear but $B(H)$ is not, even if $H$ is separable. However:

• Any commutative C*-algebra is nuclear (this underpins Proposition C.100).

• Any finite-dimensional C*-algebra is nuclear (cf. Proposition C.101).

• The (unique!) tensor product of any two nuclear C*-algebras is nuclear.

• Inductive limits of nuclear C*-algebras are nuclear (see §C.14).

• If $0\to J\to A\to B\to 0$ is a short exact sequence (i.e., if $J\subset A$ is an ideal in $A$ and
$B\cong A/J$) in which two of the three C*-algebras are nuclear, then so is the third.
C.14 Inductive limits and infinite tensor products of C*-algebras

In the main text we deal with infinite quantum systems, albeit as idealizations rather
than physical systems that exist in reality. Mathematically, such systems arise as
infinite tensor products of C*-algebras, which in turn are special cases of inductive
limits, also called direct limits (categorically, these are colimits, see §E.1 below, and
as such they are unique up to isomorphism, in this case of C*-algebras).

Let $I$ be a directed set (cf. Definition D.1), typically $I = \mathbb{N}$ with the usual order.
Let $(A_i)$ be a family of C*-algebras indexed by $I$; in case that $I = \mathbb{N}$, these will often be

$$A_n = B^{n} \equiv \hat\otimes^{\,n}_{\max} B, \tag{C.288}$$

where $B$ is some C*-algebra and $\hat\otimes_{\max}$ is the projective tensor product, extended
from two C*-algebras (as discussed in the previous section) to any finite number of
C*-algebras in the obvious way: for any completed C*-tensor product $\hat\otimes$, $n\in\mathbb{N}$, and
C*-algebras $(C_1,\ldots,C_n)$, we inductively define the tensor product of the latter as

$$C_1\hat\otimes\cdots\hat\otimes C_n = (C_1\hat\otimes\cdots\hat\otimes C_{n-1})\hat\otimes C_n. \tag{C.289}$$

In general, the cartesian product $\prod_{i\in I}A_i$ consists of all functions $a : I\to\bigcup_i A_i$ such
that $a(i)\equiv a_i\in A_i$; we often write such functions as $(a_i)_i$, where $a_i\in A_i$. The Axiom
of Choice then guarantees (or, following Russell, even states) that, provided none
of the $A_i$ is empty, the set $\prod_{i\in I}A_i$ is non-empty. Since each $A_i$ is a *-algebra, we
can turn $\prod_i A_i$ into a *-algebra in the obvious way, i.e., by defining scalar
multiplication as $(\lambda\cdot a)(i) = \lambda a(i)$, with pointwise addition, multiplication, and involution.
This *-algebra, denoted by $\oplus_i A_i$, is the algebraic direct sum of the $A_i$.

What about the norm? There are various options here, each relying on the choice
of some subspace of $\oplus_i A_i$. For example, if $A_0$ consists of all $a\in\prod_i A_i$ for which
$\lim_i\|a_i\| = 0$, then the algebraic direct sum $\oplus_i A_i$ of the $A_i$ is $A_0$, with norm

$$\|a\| = \sup_i\|a_i\|. \tag{C.290}$$

For the inductive limit we need additional structure, namely a family of
homomorphisms $\varphi_{ij} : A_i\to A_j$, defined for each $i\le j$ in $I$, such that for each $i\le j\le k$,

$$\varphi_{ii} = \mathrm{id}_{A_i}; \tag{C.291}$$

$$\varphi_{jk}\circ\varphi_{ij} = \varphi_{ik}. \tag{C.292}$$

Such maps turn the family $(A_i)$ into a so-called directed system of C*-algebras.
For example, in case of (C.288), and assuming $B$ has a unit $1_B$ (otherwise there are
analogous constructions based on projections), for $n\le m$, define $\varphi_{nm} : B^n\to B^m$ by

$$\varphi_{nm}(b) = b\otimes 1_B\otimes\cdots\otimes 1_B, \tag{C.293}$$

with $m-n$ units $1_B$. This can be done also in the more general situation (C.289),
where we assume each $C_i$ to be unital with unit $1_{C_i}$, and define

$$A_n = \hat\otimes_{i=1}^n C_i, \tag{C.294}$$

$$\varphi_{nm}(c) = c\otimes 1_{C_{n+1}}\otimes\cdots\otimes 1_{C_m}. \tag{C.295}$$
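
As a sanity check on the axioms (C.291) and (C.292), here is a small sketch for the special case $B = M_2(\mathbb{C})$, where tensoring with a unit is the Kronecker product with an identity matrix. The function `embed` and the sample matrices are hypothetical, introduced only for this illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def identity(d):
    """The d x d identity matrix."""
    return [[1 if r == c else 0 for c in range(d)] for r in range(d)]


def matmul(a, b):
    """Ordinary matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def embed(a, extra, d=2):
    """phi_nm(a) = a (x) 1 (x) ... (x) 1, with extra = m - n unit factors of size d."""
    for _ in range(extra):
        a = kron(a, identity(d))
    return a


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

is_identity_on_diagonal = embed(a, 0) == a                 # (C.291)
composes = embed(embed(a, 1), 2) == embed(a, 3)            # (C.292)
multiplicative = embed(matmul(a, b), 2) == matmul(embed(a, 2), embed(b, 2))
unital = embed(identity(2), 2) == identity(8)
```

Integer entries are used so that all comparisons are exact; the same checks go through for complex entries up to rounding.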

As a matter of central importance to the theory of quantum spin systems, one
may generalize this construction in allowing more general directed sets, whilst
specializing it in picking very specific C*-algebras $C_i$. Let $\mathbb{Z}^d$ be the standard
lattice in spatial dimension $d$, and let $I$ be the set of all finite subsets $\Lambda$ of $\mathbb{Z}^d$ (so
one typically writes $\Lambda$ instead of $i$). Furthermore, take some fixed Hilbert space $H$,
assumed finite-dimensional for simplicity (this also suffices for most applications to
quantum statistical mechanics), and for each $\Lambda\in I$, define the cartesian product

$$H^\Lambda = \prod_{x\in\Lambda}H_x, \tag{C.296}$$

where $H_x = H$ for each $x$. Thus elements $\psi : \Lambda\to H$ of $H^\Lambda$ are families $(\psi_x)_{x\in\Lambda}$,
where $\psi_x\in H$. To define the tensor product

$$H_\Lambda = \otimes_{x\in\Lambda}H_x, \tag{C.297}$$

we generalize the procedure explained between (C.245) and (C.246) in the previous
section. If $\dim(H_A) < \infty$ and $\dim(H_B) < \infty$, the injection

$$H_A\otimes H_B \to L(H_A\times H_B,\mathbb{C}), \tag{C.298}$$

is an isomorphism, and we use this fact (with $H_A = H_B = H$) to define $H_\Lambda$ as
$L(H^\Lambda,\mathbb{C})$, that is, the set of all anti-multi-linear maps $\hat\psi : H^\Lambda\to\mathbb{C}$, equipped
with pointwise operations turning it into a complex vector space. Each element
$\psi : \Lambda\to H$ of $H^\Lambda$ itself defines such a map $\hat\psi\in L(H^\Lambda,\mathbb{C})$ via

$$\hat\psi(\varphi) = \prod_{x\in\Lambda}\langle\varphi_x,\psi_x\rangle_H, \tag{C.299}$$

through which the inner product on $H_\Lambda$ is defined by linear extension of

$$\langle\hat\psi,\hat\varphi\rangle_{H_\Lambda} = \prod_{x\in\Lambda}\langle\psi_x,\varphi_x\rangle_H. \tag{C.300}$$

In this realization of $H_\Lambda$, the elementary tensors $\otimes_{x\in\Lambda}\psi_x\in H_\Lambda$ coincide with the
above elements $\hat\psi\in L(H^\Lambda,\mathbb{C}) = H_\Lambda$. Furthermore, if $(\upsilon_1,\ldots,\upsilon_n)$ is a basis of $H\cong\mathbb{C}^n$,
then $(\otimes_{x\in\Lambda}\upsilon_{s(x)})_s$ is a basis of $H_\Lambda$, where $s : \Lambda\to\{1,\ldots,n\}$. Hence

$$\dim(H_\Lambda) = \dim(H)^{|\Lambda|}. \tag{C.301}$$

Furthermore, writing $\underline{n} = \{1,2,\ldots,n\}$ and letting $\underline{n}^\Lambda$ be the set of maps ("classical
spin configurations") $s : \Lambda\to\underline{n}$, there is a natural unitary isomorphism

$$H_\Lambda \cong \ell^2(\underline{n}^\Lambda). \tag{C.302}$$
Indeed, as the functions $\delta_s : t\mapsto\delta_{st}$ form a basis of $\ell^2(\underline{n}^\Lambda)$, the map $\delta_s\mapsto\otimes_{x\in\Lambda}\upsilon_{s(x)}$
extends to a unitary from $\ell^2(\underline{n}^\Lambda)$ to $H_\Lambda$. Under this equivalence, elements of $H_\Lambda$
may be interpreted as "wave-functions" whose argument is a spin configuration.
Returning to C*-algebras, having defined $H_\Lambda$, we now put

$$A_\Lambda = B(H_\Lambda). \tag{C.303}$$
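
The counting behind (C.301) and (C.302) can be made concrete by enumerating spin configurations. The sketch below is only illustrative: the three lattice sites, the value $n = 2$, and the helper names are assumptions chosen for the example.

```python
from itertools import product


def configurations(sites, n):
    """All maps s: sites -> {1, ..., n}, stored as tuples (s(x)) indexed like `sites`."""
    return list(product(range(1, n + 1), repeat=len(sites)))


def basis_vector(s, configs):
    """The function delta_s on configurations, as a dict config -> amplitude."""
    return {t: (1.0 if t == s else 0.0) for t in configs}


def inner(psi, phi):
    """Inner product on l^2 of the configuration set, antilinear in the first slot."""
    return sum(complex(psi[t]).conjugate() * phi[t] for t in psi)


sites = [(0, 0), (0, 1), (1, 0)]   # a hypothetical finite subset of Z^2
n = 2                               # dim(H) = 2, i.e. spin 1/2
configs = configurations(sites, n)

dimension = len(configs)            # should equal n ** len(sites), cf. (C.301)
gram = [
    [inner(basis_vector(s, configs), basis_vector(t, configs)) for t in configs]
    for s in configs
]
```

A general "wave-function" in this picture is any dictionary assigning a complex amplitude to each configuration; the functions $\delta_s$ are the ones that correspond to the elementary tensors $\otimes_x\upsilon_{s(x)}$.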

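To fit this into the above framework, we note that the partial order $\le$ on $I$ is given
by $\Lambda\le\Lambda'$ whenever $\Lambda\subseteq\Lambda'$, in which case there is a canonical embedding

$$A_\Lambda \hookrightarrow A_{\Lambda'}. \tag{C.304}$$

This embedding is given as in (C.293), i.e., by adding unit operators. Let $\Lambda\subset\Lambda'$
and define $\Lambda'' = \Lambda'\setminus\Lambda$. We may split $\psi' : \Lambda'\to H$ as $\psi' = (\psi'|_\Lambda,\psi'|_{\Lambda''})$, which gives

$$H^{\Lambda'} \cong H^\Lambda\times H^{\Lambda''}. \tag{C.305}$$

As in (C.298), this gives isomorphisms

$$H_{\Lambda'} = L(H^{\Lambda'},\mathbb{C}) \cong L(H^\Lambda\times H^{\Lambda''},\mathbb{C}) \cong H_\Lambda\otimes H_{\Lambda''}. \tag{C.306}$$

This, in turn, induces an isomorphism

$$A_{\Lambda'} = B(H_{\Lambda'}) \cong B(H_\Lambda\otimes H_{\Lambda''}) = B(H_\Lambda)\otimes B(H_{\Lambda''}) = A_\Lambda\otimes A_{\Lambda''}, \tag{C.307}$$

which, through the embedding

$$B(H_\Lambda) \hookrightarrow B(H_\Lambda)\otimes B(H_{\Lambda''}); \tag{C.308}$$

$$a \mapsto a\otimes 1_{B(H_{\Lambda''})}, \tag{C.309}$$

gives an embedding $B(H_\Lambda)\hookrightarrow B(H_{\Lambda'})$. This, then, is the injection (C.304).

Alternatively, $B(H_\Lambda)$ may be constructed just like $H_\Lambda$ itself, i.e., by starting with
the set $B(H)^\Lambda$ of functions $a : \Lambda\to B(H)$. Any such $a$ defines an operator $\hat a$ on $H_\Lambda$
by first defining its action on elementary tensors by $\hat a\hat\psi = \otimes_{x\in\Lambda}a_x\psi_x$ and extending
the result linearly to arbitrary vectors in $H_\Lambda$. We write $\hat a = \otimes_{x\in\Lambda}a_x$, and reconstruct
$B(H_\Lambda)$ as the complex vector space spanned by all such elementary operators. The
injection (C.304) is given by linear extension of the map $\hat a\mapsto\hat a'$, where $a'_{x'} = a_x$
whenever $x' = x\in\Lambda\subset\Lambda'$, and $a'_{x'} = 1_H$ otherwise, i.e., if $x'\in\Lambda''$.

Either way, we obtain a directed system of C*-algebras $(A_\Lambda)$, where the finite
subsets $\Lambda\subset\mathbb{Z}^d$ are partially ordered by inclusion, and the maps $\varphi_{\Lambda\Lambda'} : A_\Lambda\to A_{\Lambda'}$,
with properties like (C.291) - (C.292), are given by the inclusions (C.304).

There is a classical counterpart to this construction, in which the local
C*-algebras are given by "functions of functions", i.e.,

$$A^{(c)}_\Lambda = C(\underline{n}^\Lambda) = C(C(\Lambda,\underline{n})). \tag{C.310}$$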
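Since $\underline{n}^\Lambda$ is a finite discrete set, any function on it is continuous (and lies in $C_0$, etc.).
If $\Lambda\subset\Lambda'$, then, $s'\in\underline{n}^{\Lambda'}$ being a map $s' : \Lambda'\to\underline{n}$, the connecting homomorphisms

$$\varphi^{(c)}_{\Lambda\Lambda'} : A^{(c)}_\Lambda \to A^{(c)}_{\Lambda'}, \tag{C.311}$$

are given quite canonically by

$$\varphi^{(c)}_{\Lambda\Lambda'}(f) : s' \mapsto f(s'|_\Lambda). \tag{C.312}$$

Note that $C(\underline{n}^\Lambda)\cong\ell^2(\underline{n}^\Lambda)$ as vector spaces, so that (C.311) also gives natural maps
$\ell^2(\underline{n}^\Lambda)\to\ell^2(\underline{n}^{\Lambda'})$ and hence, via (C.302), $H_\Lambda\to H_{\Lambda'}$. These are given by linear
extension of the map given on basis vectors by $\otimes_{x\in\Lambda}\upsilon_{s(x)}\mapsto\sum_{s' : s'|_\Lambda = s}\otimes_{y\in\Lambda'}\upsilon_{s'(y)}$.
Furthermore, analogously to (C.307), since $\Lambda' = \Lambda\cup\Lambda''$ is finite, we have

$$A^{(c)}_{\Lambda'} = C(\underline{n}^{\Lambda'}) = C(C(\Lambda',\underline{n})) = C(C(\Lambda\cup\Lambda'',\underline{n})) \cong C(C(\Lambda,\underline{n})\times C(\Lambda'',\underline{n}))$$

$$\cong C(C(\Lambda,\underline{n}))\otimes C(C(\Lambda'',\underline{n})) = C(\underline{n}^\Lambda)\otimes C(\underline{n}^{\Lambda''}) = A^{(c)}_\Lambda\otimes A^{(c)}_{\Lambda''}. \tag{C.313}$$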
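
The classical connecting maps (C.312) are simply pull-backs along restriction of configurations. The following sketch, with a hypothetical one-dimensional choice $\Lambda = \{0,1\}\subset\Lambda' = \{0,1,2\}$ and $n = 2$, checks that the pull-back respects pointwise multiplication and that the indicator function of a configuration $s$ is sent to the indicator of all its extensions $s'$. All function names are illustrative assumptions.

```python
from itertools import product


def configurations(sites, n):
    """All maps s: sites -> {1, ..., n}, each stored as a dict site -> value."""
    return [
        dict(zip(sites, values))
        for values in product(range(1, n + 1), repeat=len(sites))
    ]


def restrict(s_prime, sites):
    """Restriction s'|_Lambda of a configuration on a larger set of sites."""
    return {x: s_prime[x] for x in sites}


def pull_back(f, sites):
    """The connecting map of (C.312): f -> (s' -> f(s' restricted to sites))."""
    return lambda s_prime: f(restrict(s_prime, sites))


n = 2
small = [0, 1]        # Lambda
large = [0, 1, 2]     # Lambda', so that Lambda'' = {2}

f = lambda s: s[0] + 10 * s[1]     # two arbitrary functions on configurations of Lambda
g = lambda s: s[0] * s[1]
fg = lambda s: f(s) * g(s)

big_configs = configurations(large, n)
is_multiplicative = all(
    pull_back(fg, small)(t) == pull_back(f, small)(t) * pull_back(g, small)(t)
    for t in big_configs
)

s = {0: 1, 1: 2}
delta_s = lambda t: 1 if t == s else 0
extensions = [t for t in big_configs if pull_back(delta_s, small)(t) == 1]
```

The list `extensions` has $n^{|\Lambda''|}$ elements, one for each way of continuing $s$ to the extra sites.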

Given a directed system of C*-algebras $(A_i,\varphi_{ij})$, we define the local part $A_{\mathrm{loc}}$ of
$\prod_i A_i$ as the set of all elements $a = (a_i)$ of $\prod_i A_i$ for which there is $i_0\in I$ (depending
on $a$) such that $a_i = \varphi_{i_0 i}(a_{i_0})$ whenever $i_0\le i$. This is equivalent to the seemingly
stronger condition that $a_j = \varphi_{ij}(a_i)$ whenever $i_0\le i\le j$, since

$$a_j = \varphi_{i_0 j}(a_{i_0}) = \varphi_{ij}\circ\varphi_{i_0 i}(a_{i_0}) = \varphi_{ij}(a_i). \tag{C.314}$$

In the example (C.288) with (C.293), this simply means that for each sequence
$(a_n)$ there is $n_0\in\mathbb{N}$ such that $a_n = a_{n-1}\otimes 1_B$ for each $n > n_0$. Similarly, in
the example (C.303) with (C.304), for each $a = (a_\Lambda)$, where $\Lambda$ is a finite subset
of $\mathbb{Z}^d$ and $a_\Lambda\in A_\Lambda$ for each $\Lambda$, there is a finite subset $\Lambda_0\subset\mathbb{Z}^d$ such that for any
$\Lambda\supseteq\Lambda_0$ we have $a_\Lambda = \varphi_{\Lambda_0\Lambda}(a_{\Lambda_0})$. It is easy to see that $A_{\mathrm{loc}}$ is a *-algebra under the
(pointwise) operations inherited from $\prod_i A_i$. For each $(a_i)\in A_{\mathrm{loc}}$, the norms $\|a_i\|$
form a net in $\mathbb{R}^+$. Recall that some net $(t_i)$ in $\mathbb{R}$ (which by definition is indexed
by a directed set $I$) is said to converge to $t\in\mathbb{R}$ if for each $\varepsilon > 0$, there is $i\in I$ such
that $|t - t_j| < \varepsilon$ for all $j\ge i$ (since $\mathbb{R}$ is Hausdorff, any net in $\mathbb{R}$ converges to at most
one point). Because the connecting maps $\varphi_{ij}$ are homomorphisms of C*-algebras,
they are norm-decreasing (cf. Theorem C.62.1), i.e., $\|\varphi_{ij}(a_i)\|\le\|a_i\|$. Thus for any
$a\in A_{\mathrm{loc}}$ with associated $i_0\in I$, the (sub)net $(\|a_i\|)_{i\ge i_0}$ lies in the interval $[0,\|a_{i_0}\|]$,
and is monotone decreasing in the sense that if $j\ge i\ge i_0$, then $\|a_j\|\le\|a_i\|$. As for
sequences (which are just nets indexed by $I = \mathbb{N}$), bounded monotone decreasing (or
increasing) nets in $\mathbb{R}$ converge, so that the net $(\|a_i\|)_{i\ge i_0}$ has a limit, and this also means
that $(\|a_i\|)_i$ has the same limit. Call this limit $\|a\|_0$. The map $a\mapsto\|a\|_0$ generally
fails to define a norm on $A_{\mathrm{loc}}$, since it may lack the property of positive definiteness,
and even if it had it, the space would not be complete (at least if $I$ is infinite, as we
tacitly assume). We do have the C*-axioms $\|ab\|_0\le\|a\|_0\|b\|_0$ and $\|a^*a\|_0 = \|a\|_0^2$
though, since these hold for each norm $a_i\mapsto\|a_i\|$ and are preserved in the limit.
So we say that $\|a\|_0$ is a C*-seminorm on $A_{\mathrm{loc}}$, and there is a canonical procedure
to turn a *-algebra with C*-seminorm into a C*-algebra:

1. Define the null space $N\subset A_{\mathrm{loc}}$ for $\|\cdot\|_0$ by $N = \{a\in A_{\mathrm{loc}} : \|a\|_0 = 0\}$;

2. Define a norm on the quotient $A_{\mathrm{loc}}/N$ by $\|a+N\| = \|a\|_0$, and complete the
quotient in this norm. The result is a C*-algebra

$$A = \lim_i A_i, \tag{C.315}$$

called the inductive limit of the directed system $(A_i,\varphi_{ij})$.

For each $i\in I$, we now define a canonical homomorphism $\varphi_i : A_i\to A$. If $a_i\in A_i$, put
$a_j = \varphi_{ij}(a_i)\in A_j$ if $j\ge i$, and $a_j = 0$ otherwise. This gives an element $a\in A_{\mathrm{loc}}$ whose
image in $A_{\mathrm{loc}}/N\subset A$ is $\varphi_i(a_i)$. A computation shows that if $i\le j$, then $\varphi_j\circ\varphi_{ij} = \varphi_i$.
Using this fact, it follows that if we put $\tilde A_i = \varphi_i(A_i)\subset A$, then $\tilde A_i\subset\tilde A_j$ whenever
$i\le j$; hence $A$ may be rewritten as the norm-closure of the union of the $\tilde A_i$, i.e.,

$$A = \overline{\bigcup_i\tilde A_i}^{\,\|\cdot\|}. \tag{C.316}$$

In the simple situation where the maps $\varphi_{ij}$ are inclusions and hence isometries, as
in our examples, we have $N = \{0\}$, so that $\tilde A_i\cong A_i$, and hence (C.316) simplifies to

$$A = \overline{\bigcup_i A_i}^{\,\|\cdot\|}. \tag{C.317}$$

As a case in point, define $(A_n,\varphi_{nm})$ as in (C.294) - (C.295). The infinite tensor
product of the $C_i$ is then defined through (C.315) and (C.295), i.e., by definition,

$$\hat\otimes_{i=1}^\infty C_i = \lim_n\hat\otimes_{i=1}^n C_i = \overline{\bigcup_n\hat\otimes_{i=1}^n C_i}^{\,\|\cdot\|}. \tag{C.318}$$

Here the first equation is general, and in the second it is understood that for any
$m\ge n$, we have $\hat\otimes_{i=1}^n C_i\subset\hat\otimes_{i=1}^m C_i$ through the embeddings (C.295).

More generally, let $(A_x)_{x\in X}$ be a family of unital C*-algebras indexed by an
arbitrary set $X$, and let $I = \mathcal{P}_f(X)$ be the set of finite subsets of $X$, partially ordered by
inclusion. For any $F\in I$, we have a tensor product

$$A_F = \hat\otimes_{x\in F}A_x, \tag{C.319}$$

where once again $\hat\otimes$ is an arbitrary completed C*-tensor product. An explicit
construction of this tensor product along the lines of (C.289) requires an ordering of
$F$, but two such orderings give canonically isomorphic C*-algebras; if $F\subset G$, one
should order $G$ compatibly with $F$ for the connecting homomorphisms $\varphi_{FG}$ to be
well defined by (C.295). This gives a directed system of C*-algebras $(A_F,\varphi_{FG})$,
whose inductive limit defines the tensor product over $X$, i.e.,

$$\hat\otimes_{x\in X}A_x = \lim_F\hat\otimes_{x\in F}A_x. \tag{C.320}$$
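As a special case, we may rewrite our earlier algebras $A_\Lambda$ and $A^{(c)}_\Lambda$ as

$$A_\Lambda = \otimes_{x\in\Lambda}B(H); \tag{C.321}$$

$$A^{(c)}_\Lambda = \otimes_{x\in\Lambda}C(\underline{n}) = C(\underline{n}^\Lambda), \tag{C.322}$$

cf. (C.313). Hence we have

$$\lim_\Lambda A_\Lambda = \hat\otimes_{x\in\mathbb{Z}^d}B(H); \tag{C.323}$$

$$\lim_\Lambda A^{(c)}_\Lambda = \hat\otimes_{x\in\mathbb{Z}^d}C(\underline{n}) = C(\underline{n}^{\mathbb{Z}^d}), \tag{C.324}$$

where in the last expression the infinite product $\underline{n}^{\mathbb{Z}^d} = \prod_{x\in\mathbb{Z}^d}\underline{n}$ is endowed with the
product topology, so that (by Tychonoff's Theorem) the space in question is compact.
Thus the ensuing inductive limit may directly be expressed as the standard
commutative C*-algebra $C(X)$, where $X = \prod_{x\in\mathbb{Z}^d}\underline{n}$ is compact, equipped with pointwise
operations and the sup-norm. If $n = 2$ and $d = 1$, this is a model of the Cantor set.
The homomorphisms $\varphi_i$ enable us to state the universal character of $A$:

Theorem C.103. Let $(A_i,\varphi_{ij})$ be a directed system of C*-algebras with inductive limit
$A$. For any C*-algebra $B$ endowed with a family of homomorphisms $\beta_i : A_i\to B$ such
that $\beta_j\circ\varphi_{ij} = \beta_i$, there is a unique homomorphism $\beta : A\to B$ such that $\beta_i = \beta\circ\varphi_i$.
In other words, the following diagram commutes:

[Commutative triangle with $\varphi_i : A_i\to A$, $\beta_i : A_i\to B$, and $\beta : A\to B$, so that $\beta\circ\varphi_i = \beta_i$.] (C.325)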


Proof. This is true almost by construction, or rather by (C.316): since $\beta$ is supposed
to be a homomorphism of C*-algebras, it is continuous, so it is determined by its
values on the dense subalgebra $\bigcup_i\tilde A_i$, and hence by its values on each $\tilde A_i$. But these
values are necessarily given by $\beta(\varphi_i(a_i)) = \beta_i(a_i)$, where $a_i\in A_i$. □

Corollary C.104. Let $(A_x)_{x\in X}$ be a family of mutually commuting unital
C*-subalgebras of a unital C*-algebra $B$ (sharing the unit of $B$), such that the C*-algebra
generated by all subalgebras $A_x$ within $B$ is equal to $B$. Also, let $\hat\otimes$ be some completed
C*-tensor product such that for each finite subset $F = \{x_1,\ldots,x_n\}\subset X$, there is an
injective homomorphism $\varphi_F : A_F\to B$ (where $A_F = A_{x_1}\hat\otimes\cdots\hat\otimes A_{x_n}$) satisfying

$$\varphi_F(a_1\otimes\cdots\otimes a_n) = a_1\cdots a_n \quad (a_1\in A_{x_1},\ldots,a_n\in A_{x_n}). \tag{C.326}$$

Then $B\cong\hat\otimes_{x\in X}A_x$.

Proof. In Theorem C.103, take $A_i\leadsto A_F$ and $\beta_i\leadsto\varphi_F$. In view of (C.320), this gives
a homomorphism $\beta : \hat\otimes_{x\in X}A_x\to B$. Here, this map is an isomorphism. □







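Finally, we give a result on infinite tensor products of states, needed in §8.4.

Proposition C.105. Let $(C_i)_{i\in\mathbb{N}}$ be unital C*-algebras, and define their infinite
(projective) tensor product $\hat\otimes_{i=1}^\infty C_i$ as in (C.318). For each $i\in\mathbb{N}$, let $\omega_i$ be a state on $C_i$.
Then there is a unique state $\otimes_{i=1}^\infty\omega_i$ on $\hat\otimes_{i=1}^\infty C_i$ such that for each $n\in\mathbb{N}$ and $c_i\in C_i$,

$$\otimes_{i=1}^\infty\omega_i\big(\varphi_n(c_1\otimes\cdots\otimes c_n)\big) = \prod_{i=1}^n\omega_i(c_i). \tag{C.327}$$

Moreover, $\otimes_{i=1}^\infty\omega_i$ is pure iff each $\omega_i$ is pure.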

Proof. We write $C^n = \hat\otimes_{i=1}^n C_i$ and similarly $\otimes_{i=1}^n\omega_i = \omega^n$, also for $n = \infty$.
Eq. (C.327) defines $\omega^\infty$ on a dense subset $\bigcup_n\varphi_n(C^n)$ of $C^\infty$, which proves
uniqueness. Existence comes from Proposition C.98, according to which the map
$c_1\otimes\cdots\otimes c_n\mapsto\prod_{i=1}^n\omega_i(c_i)$ extends to a state $\otimes_{i=1}^n\omega_i$ on $C^n$, which in turn defines a
state $\omega^n$ on $\varphi_n(C^n)\subset C^\infty$. Since $(\otimes^n\omega_i)|_{C^m} = \otimes^m\omega_i$ whenever $m\le n$, one also has
$\omega^n|_{\varphi_m(C^m)} = \omega^m$. We may therefore define a functional $\omega^\infty$ on $\bigcup_n\varphi_n(C^n)$ by its
restrictions $\omega^\infty|_{\varphi_n(C^n)} = \omega^n$. Since $\omega^n$ is a state and hence satisfies $\|\omega^n\| = \omega^n(1) = 1$, so
does $\omega^\infty$ (on its dense domain). Since the continuous extension of $\omega^\infty$ to $C^\infty$ has the
same norm, this extension (still called $\omega^\infty$) is a state by Proposition C.5.

One direction of the second claim is trivial; if at least one of the $\omega_i$ fails to be
pure, then $\omega^n$ inherits its convex decomposition so to speak, so contrapositively
we obtain that purity of $\omega^n$ implies purity of each $\omega_i$. We first prove the opposite
direction for $n < \infty$. Using Proposition C.91 and the fact that $C^n$ is a completion
of the algebraic tensor product $\otimes_{i=1}^n C_i$, the GNS-representation $\pi_{\omega^n}(C^n)$ is unitarily
equivalent to the representation $\pi_{\omega_1}\otimes\cdots\otimes\pi_{\omega_n}$ on $H_{\omega_1}\otimes\cdots\otimes H_{\omega_n}$, and

$$\big(\pi_{\omega_1}\otimes\cdots\otimes\pi_{\omega_n}(C^n)\big)'' = \pi_{\omega_1}(C_1)''\,\bar\otimes\cdots\bar\otimes\,\pi_{\omega_n}(C_n)''. \tag{C.328}$$

Here, for any two von Neumann algebras $A$ and $B$, $A\bar\otimes B$ is the smallest von Neumann
algebra containing the algebraic tensor product $A\otimes B$. The main lemma behind the
second claim is the nontrivial commutation theorem for von Neumann algebras:

$$(A\bar\otimes B)' = A'\bar\otimes B', \tag{C.329}$$

which we state without proof. This iterates to $n$ von Neumann algebras. Hence

$$\big(\pi_{\omega_1}\otimes\cdots\otimes\pi_{\omega_n}(C^n)\big)' = \pi_{\omega_1}(C_1)'\,\bar\otimes\cdots\bar\otimes\,\pi_{\omega_n}(C_n)', \tag{C.330}$$

so that the claim for $n < \infty$ follows from Theorem C.90.

Now take $n = \infty$, and assume each $\omega_i$ is pure. Suppose that for some $t\in(0,1)$,

$$\omega^\infty = t\omega' + (1-t)\omega'', \tag{C.331}$$

and restrict this equality to $\varphi_n(C^n)$. By the previous argument, the restriction of $\omega^\infty$
to $\varphi_n(C^n)$, which is just $\omega^n$, is pure for any $n\in\mathbb{N}$. This gives

$$\omega'|_{\varphi_n(C^n)} = \omega''|_{\varphi_n(C^n)}.$$

This is true for each $n$, so that $\omega' = \omega''$. Hence $\omega^\infty$ is pure. □
C.15 Gelfand isomorphism and Fourier theory

One of the most beautiful applications of Theorem C.8 is to commutative harmonic
analysis. Let $G$ be an abelian locally compact Hausdorff group (e.g., $G = \mathbb{R}$, $G = \mathbb{Z}$,
or $G = \mathbb{T}$). Such groups have an invariant Haar measure $dx$, which satisfies

$$\int_G dx\,L_y f(x) = \int_G dx\,f(x^{-1}) = \int_G dx\,f(x), \tag{C.332}$$

for any $f\in C_c(G)$ and $y\in G$, where

$$L_y f(x) = f(y^{-1}x). \tag{C.333}$$

This measure is unique up to rescaling; if $G$ is compact, it is normalized such that
$\int_G dx = 1$. For $G = \mathbb{R}$, this recovers Lebesgue measure on $\mathbb{R}$, whilst for $\mathbb{Z}$ and $\mathbb{T}$,

$$\int_{\mathbb{Z}}dx\,f(x) = \sum_{n\in\mathbb{Z}}f(n); \tag{C.334}$$

$$\int_{\mathbb{T}}dx\,f(x) = \int_0^{2\pi}\frac{d\alpha}{2\pi}\,f(e^{i\alpha}). \tag{C.335}$$

For $f,g\in C_c(G)$, the convolution product $f*g$ is defined by

$$f*g(x) = \int_G dy\,f(y)g(y^{-1}x). \tag{C.336}$$

Using (C.332), it is easy to verify that this product is commutative and associative.
Also, one may define an involution on $C_c(G)$ by

$$f^*(x) = \overline{f(x^{-1})}. \tag{C.337}$$

We would now like to turn $C_c(G)$ into a commutative C*-algebra, but the obvious
norms like the $L^p$-ones do not accomplish this. Instead, for $f\in C_c(G)$ we define an
operator $\pi(f)$ on the Hilbert space $L^2(G)$ (defined with respect to Haar measure) by

$$\pi(f)\psi = f*\psi, \tag{C.338}$$

initially for $\psi\in C_c(G)$. Equivalently, we may write

$$\pi(f) = \int_G dy\,f(y)L_y, \tag{C.339}$$

where we regard $L_y$ as an (obviously unitary) operator on $L^2(G)$, and the integral is
most easily defined weakly, i.e., $\pi(f)$ is the unique bounded operator for which

$$\langle\varphi,\pi(f)\psi\rangle = \int_G dy\,f(y)\langle\varphi,L_y\psi\rangle. \tag{C.340}$$
J G 
Since $L_y$ is unitary, this formula also shows that $|\langle\varphi,\pi(f)\psi\rangle|\le\|f\|_1\|\varphi\|\|\psi\|$,
where $\|f\|_1 = \int_G dx\,|f(x)|$. Taking $\varphi = \pi(f)\psi$ gives $\|\pi(f)\psi\|\le\|f\|_1\|\psi\|$, whence

$$\|\pi(f)\|\le\|f\|_1. \tag{C.341}$$

Hence $\pi(f)$ is bounded and extends from $C_c(G)$ to all of $L^2(G)$ by continuity.

Lemma C.106. The map $f\mapsto\pi(f)$ from $C_c(G)$ to $B(L^2(G))$ is injective and satisfies

$$\pi(f*g) = \pi(f)\pi(g); \tag{C.342}$$

$$\pi(f^*) = \pi(f)^*. \tag{C.343}$$

Proof. Eq. (C.342) follows from associativity of convolution, and (C.343) follows
from the last equality in (C.332). To prove injectivity, we fix $f\in C_c(G)$, pick $\varepsilon > 0$,
and find a neighbourhood $U$ of $e\in G$ such that $y^{-1}x\in U$ implies $|f(y)-f(x)| < \varepsilon$.
Then, using Urysohn's Lemma, one may find a positive function $\psi_U\in C_c(U)$ such
that $\int_G\psi_U = 1$. Injectivity of $\pi$ then immediately follows from the easy estimate

$$|f*\psi_U(x) - f(x)| \le \int_G dy\,|f(y)-f(x)|\cdot|\psi_U(y^{-1}x)| < \varepsilon. \qquad\Box$$

JG 

Definition C.107. Let $G$ be an abelian locally compact Hausdorff group. The group
C*-algebra $C^*(G)$ is the norm closure of $\pi(C_c(G))$ in $B(L^2(G))$, with norm

$$\|f\|_{C^*} = \|\pi(f)\|_{B(L^2(G))}. \tag{C.344}$$

Since $\pi(C_c(G))$ is a commutative *-algebra in $B(L^2(G))$ by Lemma C.106, it is
easy to see (from joint continuity of multiplication) that its norm closure $C^*(G)$ is a
commutative C*-algebra, whose Gelfand spectrum we wish to compute.

To this effect, we first define the dual group or character group $\hat G$ of $G$ as

$$\hat G = \mathrm{Hom}(G,\mathbb{T}), \tag{C.345}$$

i.e., the set of continuous group homomorphisms from $G$ to $\mathbb{T}$, equipped with the
compact-open topology. This topology is defined as the restriction to $\mathrm{Hom}(G,\mathbb{T})$ of
the topology on $C(G,\mathbb{C})$ generated by the neighbourhood basis of some $\gamma\in\hat G$, i.e.,

$$O(\gamma,K,\varepsilon) = \{\varphi\in\hat G : |\gamma(x)-\varphi(x)| < \varepsilon\ \forall x\in K\}, \tag{C.346}$$

where $K\subset G$ is compact and $\varepsilon > 0$. The corresponding notion of convergence is uniform
convergence on each compact subset of $G$; in particular, if $G$ is compact, this is just
uniform convergence. Equipped with this topology, it can be shown that $\hat G$ is itself
an abelian locally compact Hausdorff group under pointwise operations, i.e.,

$$(\gamma_1\gamma_2)(x) = \gamma_1(x)\gamma_2(x); \tag{C.347}$$

$$\gamma^{-1}(x) = \overline{\gamma(x)}; \tag{C.348}$$

hence the ensuing unit $e$ in $\hat G$ is the constant function $e = 1_G$ in $\mathrm{Hom}(G,\mathbb{T})$.
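Proposition C.108. We have the following examples of dual groups:

$$\hat{\mathbb{Z}}\cong\mathbb{T}, \qquad \gamma_z(n) = z^n; \tag{C.349}$$

$$\hat{\mathbb{R}}\cong\mathbb{R}, \qquad \gamma_p(x) = e^{ipx}; \tag{C.350}$$

$$\hat{\mathbb{T}}\cong\mathbb{Z}, \qquad \gamma_n(z) = z^n; \tag{C.351}$$

$$\hat{\mathbb{Z}}_p\cong\mathbb{Z}_p, \qquad \gamma_m(n) = e^{2\pi imn/p}. \tag{C.352}$$

Here $\mathbb{Z}_p = \mathbb{Z}/(p\cdot\mathbb{Z})$ is the (finite) group of integers mod $p$.

Proof. For (C.349), any character $\gamma : \mathbb{Z}\to\mathbb{T}$ is determined by its value $\gamma(1) = z$,
since for $n > 0$ we have $\gamma(n) = \gamma(1+\cdots+1) = \gamma(1)^n = z^n$, where the sum has $n$
terms; for $n < 0$, we obtain the same result from $\gamma(n) = \gamma(-n)^{-1} = (z^{-n})^{-1} = z^n$.

To prove (C.350), we need to solve $\gamma(x+y) = \gamma(x)\gamma(y)$ with $\gamma(0) = 1$, where
$\gamma : \mathbb{R}\to\mathbb{T}$ is continuous. To see that (C.350) gives all solutions, find $\varepsilon > 0$ for which
$\int_0^\varepsilon dy\,\gamma(y)\equiv a\neq 0$; this is possible, since $\gamma(0) = 1$ and $\gamma$ is continuous. Then

$$\int_0^\varepsilon dy\,\gamma(y)\gamma(x) = \int_0^\varepsilon dy\,\gamma(x+y) = \int_x^{\varepsilon+x}dy\,\gamma(y), \tag{C.353}$$

so that $\gamma$ is differentiable, with, writing $\dot\gamma$ for $d\gamma/dx$,

$$a\dot\gamma(x) = \gamma(\varepsilon+x)-\gamma(x) = (\gamma(\varepsilon)-1)\gamma(x). \tag{C.354}$$

Hence $\dot\gamma(x) = c\gamma(x)$ with $c = (\gamma(\varepsilon)-1)/a$, so that $\gamma(x) = \exp(cx)$. Since $|\gamma(x)| = 1$,
this forces $c = ip$ for some $p\in\mathbb{R}$. This also implies (C.351), since $\mathbb{T}\cong\mathbb{R}/\mathbb{Z}$ and
hence the characters of $\mathbb{T}$ are those characters of $\mathbb{R}$ that map $\mathbb{Z}$ to 1. Similarly,
(C.352) follows from (C.349); the characters on $\mathbb{Z}$ that are trivial on $p\cdot\mathbb{Z}$ take the
form $\gamma(n) = z^n$ for some $p$-th root of unity $z = \exp(2\pi im/p)$, $m\in\{1,\ldots,p\}$. □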
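
The finite case (C.352) is easy to check numerically. The sketch below assumes $p = 5$ and labels the characters by $m = 0,\ldots,p-1$ (so that $m = 0$ plays the role of $m = p$ in the proof); it verifies that each $\gamma_m$ is a homomorphism into $\mathbb{T}$, that pointwise multiplication of characters reproduces addition mod $p$ of the labels, and that distinct characters are orthogonal with respect to normalized counting measure. The helper names are illustrative assumptions.

```python
import cmath


def character(m, p):
    """gamma_m(n) = exp(2*pi*i*m*n/p), a character of Z_p."""
    return lambda n: cmath.exp(2j * cmath.pi * m * n / p)


def close(z, w, tol=1e-9):
    """Numerical equality of complex numbers up to rounding."""
    return abs(z - w) < tol


p = 5
chars = [character(m, p) for m in range(p)]

# Each gamma_m is a group homomorphism Z_p -> T.
is_homomorphism = all(
    close(g((a + b) % p), g(a) * g(b))
    for g in chars for a in range(p) for b in range(p)
)
takes_values_in_circle = all(close(abs(g(a)), 1.0) for g in chars for a in range(p))

# Pointwise products, as in (C.347), realize the dual group as Z_p again.
dual_is_cyclic = all(
    close(chars[m](x) * chars[k](x), chars[(m + k) % p](x))
    for m in range(p) for k in range(p) for x in range(p)
)

# Orthogonality with respect to the normalized Haar (counting) measure.
orthonormal = all(
    close(
        sum(chars[m](x).conjugate() * chars[k](x) for x in range(p)) / p,
        1.0 if m == k else 0.0,
    )
    for m in range(p) for k in range(p)
)
```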

Theorem C.109. Let $G$ be an abelian locally compact Hausdorff group. Then the
Gelfand spectrum $\Sigma(C^*(G))$ is homeomorphic to $\hat G$, and the Gelfand isomorphism

$$C^*(G)\cong C_0(\hat G) \tag{C.355}$$

is given on the dense subspace $C_c(G)\subset C^*(G)$ by the generalized Fourier transform

$$\hat f(\gamma) = \int_G dx\,\overline{\gamma(x)}f(x). \tag{C.356}$$

Thus the Fourier transform is a special case of the Gelfand transform (which is
noteworthy if only because Gelfand himself promulgated the unity of mathematics).
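
For the finite group $G = \mathbb{Z}_p$ the transform (C.356) is a finite sum, and one can see directly why it is a character of the convolution algebra. The sketch below is an illustration under stated assumptions: $p = 6$, Haar measure normalized to total mass 1 (as the text prescribes for compact $G$), additive notation for the group, and arbitrarily chosen sample functions. It checks that the transform turns the convolution (C.336) into a pointwise product and the involution (C.337) into complex conjugation.

```python
import cmath


def fourier(f, p):
    """f_hat(m) = (1/p) * sum_x conj(gamma_m(x)) * f(x), with gamma_m(x) = exp(2*pi*i*m*x/p)."""
    return [
        sum(cmath.exp(-2j * cmath.pi * m * x / p) * f[x] for x in range(p)) / p
        for m in range(p)
    ]


def convolve(f, g, p):
    """(f * g)(x) = (1/p) * sum_y f(y) * g(x - y): (C.336) in additive notation."""
    return [sum(f[y] * g[(x - y) % p] for y in range(p)) / p for x in range(p)]


def involution(f, p):
    """f^*(x) = conj(f(-x)): (C.337) in additive notation."""
    return [complex(f[(-x) % p]).conjugate() for x in range(p)]


def close(u, v, tol=1e-9):
    """Entrywise numerical equality of two lists of complex numbers."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


p = 6
f = [1, 2 - 1j, 0, 3, 1j, -2]        # sample functions on Z_6 (illustrative)
g = [0.5, 0, 1, 1 + 1j, -1, 4]

f_hat, g_hat = fourier(f, p), fourier(g, p)

convolution_becomes_product = close(
    fourier(convolve(f, g, p), p),
    [a * b for a, b in zip(f_hat, g_hat)],
)
involution_becomes_conjugation = close(
    fourier(involution(f, p), p),
    [a.conjugate() for a in f_hat],
)
commutative = close(convolve(f, g, p), convolve(g, f, p))
```

The same checks go through with unnormalized counting measure, as long as the convolution and the transform use the same choice.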

Proof. We will prove that each character $\gamma\in\hat G$ on $G$ defines a character $\omega_\gamma$ on
$C^*(G)$ by continuous extension (i.e., from its dense subspace $C_c(G)$ to $C^*(G)$) of

$$\omega_\gamma(f) = \hat f(\gamma), \tag{C.357}$$

as in (C.356), and that the map $\gamma\mapsto\omega_\gamma$ gives a homeomorphism $\hat G\cong\Sigma(C^*(G))$.
It follows from simple computations that for $f,g\in C_c(G)$, one has

$$\omega_\gamma(f*g) = \omega_\gamma(f)\omega_\gamma(g); \tag{C.358}$$

$$\omega_\gamma(f^*) = \overline{\omega_\gamma(f)}. \tag{C.359}$$

To finish the proof, we need three further nontrivial facts about the map $\gamma\mapsto\omega_\gamma$:

1. It is surjective, i.e., if $\omega\in\Sigma(C^*(G))$, then $\omega = \omega_\gamma$ for some $\gamma\in\hat G$.

2. It is injective, in that $\hat f(\gamma) = \hat f(\gamma')$ for all $f\in C_c(G)$ implies $\gamma = \gamma'$. Moreover,
the character $\omega_\gamma$ initially defined on $C_c(G)$ by (C.357) will be shown to satisfy

$$|\omega_\gamma(f)|\le\|f\|_{C^*}. \tag{C.360}$$

Thus $\omega_\gamma : C_c(G)\to\mathbb{C}$ may be extended to $C^*(G)$ by continuity in the usual way.

3. The compact-open topology on $\hat G$ is mapped to the Gelfand topology on $\Sigma(C^*(G))$.

To prove the first point, we restrict a character $\omega : C^*(G)\to\mathbb{C}$ to $C_c(G)$ and note
that because of the bound (C.341), this restriction in turn extends to an element of
$L^1(G)^*$, which we still call $\omega$. Entry 10 in Table B.1 gives $L^1(X)^*\cong L^\infty(X)$, in the
sense that any $\varphi\in L^1(X)^*$ is given by $\varphi_f(g) = \int_X fg$ for some $f\in L^\infty(X)$. Hence

$$\omega(f) = \int_G dx\,\hat\omega(x)f(x), \tag{C.361}$$

where $\hat\omega\in L^\infty(G)$. The multiplicative property $\omega(f*g) = \omega(f)\omega(g)$ then gives

$$\hat\omega(xy) = \hat\omega(x)\hat\omega(y) \tag{C.362}$$

almost everywhere (a.e.) with respect to Haar measure.

To prove continuity of $\hat\omega$, compare the following expressions with $f,g\in C_c(G)$:

$$\omega(f)\omega(g) = \omega(f)\int_G dx\,\hat\omega(x)g(x);$$

$$\omega(f*g) = \int_G dx\,\omega(L_x f)g(x).$$

These must coincide, so if we pick some $f\in C_c(G)$ for which $\omega(f)\neq 0$ (which is
possible since $C_c(G)$ is dense in $C^*(G)$ and $\omega$ is not identically zero), then we obtain

$$\hat\omega(x) = \omega(L_x f)/\omega(f), \tag{C.363}$$

almost everywhere. Hence we may redefine $\hat\omega$ by (C.363) for all $x\in G$. Since

$$|\omega(L_x f)-\omega(L_y f)| \le \|L_x f-L_y f\|_{C^*} \le \|L_x f-L_y f\|_1 \le C\|L_x f-L_y f\|_\infty, \tag{C.364}$$

recalling that $f$ has compact support, it follows that the function $x\mapsto\omega(L_x f)$ is
continuous, whence also $\hat\omega$ as redefined by (C.363) is continuous.
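We now show that $\hat\omega(x)\in\mathbb{T}$. If $|\hat\omega(x)| > 1$, then $\hat\omega$ cannot be bounded (whereas
we know it lies in $L^\infty(G)$), because $\hat\omega(x^n) = \hat\omega(x)^n$ by (C.362). But the same is true
if $|\hat\omega(x)| < 1$, because using $\hat\omega(x^{-1}) = \hat\omega(x)^{-1}$ (which follows from (C.362) and
(C.363), which gives $\hat\omega(e) = 1$), the same argument applies with $x^{-1}$ instead of $x$.
Thus $\hat\omega : G\to\mathbb{T}$ is a character $\bar\gamma\in\hat G$ (where the bar is conventional), so that (C.361)
turns into (C.356). As to injectivity, if $\hat f(\gamma) = \hat f(\gamma')$ for all $f\in C_c(G)$, then

$$\int_G dx\,\big(\overline{\gamma(x)}-\overline{\gamma'(x)}\big)f(x) = 0, \tag{C.365}$$

for all such $f$, which by standard integration theory gives $\gamma' = \gamma$ a.e. and hence
everywhere, since both functions are continuous. To prove (C.360), we use a trick:
take some fixed $\omega_0\in\Sigma(C^*(G))$, so that $\omega_0(f) = \hat f(\gamma_0)$ for some $\gamma_0\in\hat G$ by the
previous step of the proof, and $|\omega_0(f)|\le\|f\|_{C^*}$ for all $f\in C^*(G)$. For $\gamma\in\hat G$ and
$f\in C_c(G)$, eqs. (C.356) and (C.347) give $\omega_\gamma(f) = \omega_0(\bar\gamma\gamma_0 f)$, where $\bar\gamma\gamma_0 f$ is the
pointwise product of the three given functions from $G$ to $\mathbb{C}$. Hence

$$|\omega_\gamma(f)| = |\omega_0(\bar\gamma\gamma_0 f)| \le \|\pi(\bar\gamma\gamma_0 f)\| = \|\bar\gamma\gamma_0 f\|_{C^*}. \tag{C.366}$$

We now denote $\bar\gamma\gamma_0$ by $\gamma'$, which lies in $\hat G$, and note that for any $\gamma'\in\hat G$, we have

$$\langle\varphi,\pi(\gamma' f)\psi\rangle = \langle\overline{\gamma'}\varphi,\pi(f)(\overline{\gamma'}\psi)\rangle \quad (\varphi,\psi\in L^2(G),\ f\in C_c(G)). \tag{C.367}$$

Taking $\varphi = \pi(\gamma' f)\psi$ and using Cauchy-Schwarz as well as $\|\overline{\gamma'}\varphi\| = \|\varphi\|$, gives

$$\|\pi(\gamma' f)\psi\| \le \|\pi(f)\overline{\gamma'}\psi\| \quad (\psi\in L^2(G),\ f\in C_c(G),\ \gamma'\in\hat G). \tag{C.368}$$

Taking the sup over all $\psi\in L^2(G)$ with $\|\psi\| = 1$ (which also means $\|\overline{\gamma'}\psi\| = 1$) gives
$\|\pi(\gamma' f)\|\le\|\pi(f)\|$. Combined with (C.366) and (C.360), this gives the bound

$$|\omega_\gamma(f)|\le\|f\|_{C^*}. \tag{C.369}$$

We now prove continuity of the map $\omega_\gamma\mapsto\gamma$ from $\Sigma(C^*(G))$ to $\hat G$ (using
sequences for simplicity, the argument for nets being similar). We claim that if $\omega_{\gamma_n}\to\omega_\gamma$, i.e.,
$\hat f(\gamma_n)\to\hat f(\gamma)$ for each $f\in C^*(G)$, and hence for each $f\in C_c(G)$, then $\gamma_n\to\gamma$
uniformly on any compact $K\subset G$. Writing $\gamma'_n = \gamma_n\bar\gamma$ and $g = f\bar\gamma$, we first notice that

$$|\gamma_n(x)-\gamma(x)| = |\gamma'_n(x)-1|; \tag{C.370}$$

$$\hat f(\gamma_n)-\hat f(\gamma) = \hat g(\gamma'_n)-\hat g(1_G). \tag{C.371}$$

This shows that we may reduce the proof to the case $\gamma = 1_G$; otherwise, simply
change $\gamma_n$ to $\gamma'_n$. Thus we assume that $\hat f(\gamma_n)\to\hat f(1_G)$ for each $f\in C_c(G)$. We now
pick some fixed $g\in C_c(G)$ such that $\hat g(1_G) = \int_G g = 1$. For $\varepsilon > 0$, by uniform
continuity there is a neighbourhood $U$ of the identity $e\in G$ such that, cf. (C.364),

$$\|L_u g-g\|_1 < \varepsilon/3 \quad (u\in U). \tag{C.372}$$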
l|f^«^-^lli <e/3 («GG). (C.372) 
Then $\bigcup_{x\in G}xU$ covers $G$, and hence also covers each compact set $K\subset G$. Therefore,
$K$ has a finite subcover $\bigcup_{j\in J}x_jU$. Define $g_j = L_{x_j}g$. By invariance of the Haar
measure, we have $\hat g_j(1_G) = 1$, so that by definition of $\omega_{\gamma_n}\to\omega_{1_G}$, we may find $N\in\mathbb{N}$
such that for each $j\in J$ and for all $n > N$, we have

$$|\hat g_j(\gamma_n)-1| < \varepsilon/3. \tag{C.373}$$

Also, if $x\in K$, then $x = x_ju$ for some $j\in J$ and $u\in U$. Eq. (C.372) then implies

$$|\hat g_j(\gamma_n)(\gamma_n(x)-1)| = \Big|\int_G dy\,\big(L_ug(y)-g(y)\big)\gamma_n(y)\Big| < \varepsilon/3. \tag{C.374}$$

Hence for any compact $K\subset G$ and $x\in K$ as above, we may estimate, for all $n > N$,

$$|\gamma_n(x)-1| \le |\gamma_n(x)(1-\hat g_j(\gamma_n))| + |\hat g_j(\gamma_n)(\gamma_n(x)-1)| + |\hat g_j(\gamma_n)-1| < \varepsilon/3+\varepsilon/3+\varepsilon/3 = \varepsilon. \tag{C.375}$$

Consequently, $\hat f(\gamma_n)\to\hat f(1_G)$ for each $f\in C_c(G)$ implies $\gamma_n\to 1_G$ in $\hat G$; as we have
argued, this proves continuity of the bijection $\Sigma(C^*(G))\to\hat G$ given by $\omega_\gamma\mapsto\gamma$.

If $\Sigma(C^*(G))$ and $\hat G$ are compact (which is the case iff $G$ is discrete, in which case
$C^*(G)$ has a unit $\delta_e$) we are ready, since a continuous bijection from a compact space
to a Hausdorff space has a continuous inverse, and hence is a homeomorphism (in
our case, both spaces are compact as well as Hausdorff). In general, continuity of the
map $\gamma\mapsto\omega_\gamma$ from $\hat G$ to $\Sigma(C^*(G))$ almost immediately follows from the definition of
the compact-open topology on $\hat G$: if $\gamma_n\to\gamma$ in this topology (similarly for nets), and
$f\in C_c(G)$, then $\hat f(\gamma_n)\to\hat f(\gamma)$, and hence $\omega_{\gamma_n}(f)\to\omega_\gamma(f)$. A simple $\varepsilon/3$-argument
then gives the same result for $f\in C^*(G)$. □

Note that local compactness of $\hat G$ (though provable directly) also follows from this
theorem, since we know this for the Gelfand spectrum $\Sigma(C^*(G))$, cf. Theorem C.45.

Beside the Gelfand isomorphism (C.355), in which the two function spaces
$C^*(G)$ and $C_0(\hat G)$ are of a different type, there exist more symmetric versions of
the generalized Fourier transform (C.356). In the setting of Banach spaces (as
opposed to spaces of distributions, which would take us into the territory of locally
convex topological vector spaces, and hence outside the scope of this appendix,
though cf. §5.11), there are (at least) two natural possibilities. The traditional and
most familiar one is provided by the Hilbert spaces $L^2(G)$ and $L^2(\hat G)$, defined with
respect to suitably normalized Haar measures $dx$ (on $G$) and $d\gamma$ (on $\hat G$), respectively.
A second, more recent possibility is to use the following two Banach spaces.

Definition C.110. The Banach space $C_0^*(G)$ is the completion of $C_c(G)$ in the norm

$$\|f\|_0 = \max\{\|f\|_{C^*},\|f\|_\infty\}. \tag{C.376}$$

Similarly, the Banach space $C_0^*(\hat G)$ is the completion of $C_c(\hat G)$ in the norm

$$\|\hat f\|_0 = \max\{\|\hat f\|_{C^*},\|\hat f\|_\infty\}. \tag{C.377}$$



It follows that $C^*_0(G)$ can be norm-decreasingly injected into both $C^*(G)$ and $C_0(G)$, so that $C^*_0(G)$ is a subspace of $C_0(G)$ as well as of $C^*(G)$. By (C.341) and (C.360),

$$L^1(G) \cap C_0(G) \subset C^*_0(G), \tag{C.378}$$

and similarly for $C^*_0(\hat{G})$. Indeed, $C^*_0(G)$ (and likewise $C^*_0(\hat{G})$) could equivalently have been defined as the completion of $L^1(G) \cap C_0(G)$ in the norm (C.376).

Theorem C.111. The Fourier transform (C.356) induces isometric isomorphisms

$$L^2(G) \cong L^2(\hat{G}); \tag{C.379}$$

$$C^*_0(G) \cong C^*_0(\hat{G}), \tag{C.380}$$

such that, on suitably normalizing $dx$ and $d\gamma$, the Fourier inversion formula

$$f(x) = \int_{\hat{G}} d\gamma\, \gamma(x) \hat{f}(\gamma), \tag{C.381}$$

cf. (C.356), in both cases holds verbatim whenever $f \in L^1(G)$ and $\hat{f} \in L^1(\hat{G})$, in which case $f$ and $\hat{f}$ are continuous, and (C.356) and (C.381) hold pointwise.
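To make the statement concrete, the following is a minimal numerical sketch for the finite cyclic group $G = \mathbb{Z}_N$, where every function is integrable and the dual group is again $\mathbb{Z}_N$. It assumes the convention that the forward transform carries the complex-conjugated character, takes counting measure as Haar measure on $G$, and counting measure divided by $N$ on the dual; the names `chars`, `f_hat` and `f_back` are illustrative only.

```python
import numpy as np

N = 8                                   # order of the cyclic group G = Z_N (illustrative choice)
x = np.arange(N)
# chars[k, x] = gamma_k(x) = exp(2*pi*i*k*x/N); the index k labels the dual group
chars = np.exp(2j * np.pi * np.outer(x, x) / N)

rng = np.random.default_rng(0)
f = rng.normal(size=N) + 1j * rng.normal(size=N)

# Forward transform; Haar measure dx on G is counting measure (assumed normalization)
f_hat = chars.conj() @ f

# Inversion formula; Haar measure on the dual group is counting measure divided by N
f_back = (chars.T @ f_hat) / N

inversion_ok = np.allclose(f_back, f)                      # f(x) = sum over characters
value_at_unit_ok = np.isclose(f[0], f_hat.sum() / N)       # special case x = e
plancherel_ok = np.isclose(np.sum(np.abs(f) ** 2),
                           np.sum(np.abs(f_hat) ** 2) / N)  # L^2 isometry
```

With these normalizations the three flags check the inversion formula, its special case at the unit, and the $L^2$-isometry; a different normalization of one Haar measure would have to be compensated in the other.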

The Fourier inversion formula (C.381) is actually equivalent to its special case

$$f(e) = \int_{\hat{G}} d\gamma\, \hat{f}(\gamma), \tag{C.382}$$

where $e \in G$ is the unit, since (C.381) follows by substituting $L_{x^{-1}} f$ for $f$ and using

$$\widehat{L_{x^{-1}} f}(\gamma) = \gamma(x) \hat{f}(\gamma). \tag{C.383}$$

It is also important to realize that conceptually, the inversion formula (C.381) reads

$$f(x) = \hat{\hat{f}}(x^{-1}), \tag{C.384}$$

where the Fourier transform $\hat{\zeta}$ for suitable $\zeta : \hat{G} \to \mathbb{C}$ is defined, as in (C.356), by

$$\hat{\zeta}(\chi) = \int_{\hat{G}} d\gamma\, \overline{\chi(\gamma)}\, \zeta(\gamma). \tag{C.385}$$

Here $\chi : \hat{G} \to \mathbb{T}$ is some character on $\hat{G}$, i.e., $\chi \in \hat{\hat{G}}$, and we have a natural map

$$G \to \hat{\hat{G}}; \tag{C.386}$$

$$x \mapsto \hat{x}; \tag{C.387}$$

$$\hat{x}(\gamma) = \gamma(x). \tag{C.388}$$

Pontryagin duality states that (C.386) - (C.388) define an isomorphism, i.e.,

$$\hat{\hat{G}} \cong G. \tag{C.389}$$

We omit the lengthy proof of this beautiful isomorphism of topological groups (cf. the examples in Proposition C.108), and turn to the proof of Theorem C.111.

Proof. First, we (re)construct a correctly normalized Haar measure on $\hat{G}$ by defining

$$\int_{\hat{G}} : C_c(\hat{G}, \mathbb{R}) \to \mathbb{R}; \tag{C.390}$$

$$\zeta \mapsto \inf\{f(e) \mid f \in C^*_0(G),\ \hat{f} \geq \zeta \text{ (pointwise)}\}. \tag{C.391}$$

This map takes values in $\mathbb{R}$, since if $\hat{f}$ is real, as required by $\hat{f} \geq \zeta$ in (C.391), then, noting that the Gelfand (= Fourier) transform on $C^*_0(G)$ maps the involution (C.337) into complex conjugation on $C_0(\hat{G})$, so is $f(e)$, cf. (C.337). Furthermore, $\int_{\hat{G}}$ is linear, as well as positive: if $\zeta \geq 0$ (i.e., pointwise), then also $\hat{f} \geq 0$ in $C_0(\hat{G})$, so that $f \geq 0$ in $C^*(G)$, because by Theorem C.109 the map $f \mapsto \hat{f}$ is an isomorphism, which by Theorem C.52 preserves positivity. This gives $\langle \psi, \pi(f)\psi \rangle \geq 0$ for all $\psi \in L^2(G)$, which by a simple continuity argument (in a proof by contradiction, using the inclusion $C^*_0(G) \subset C_0(G)$) enforces $f(e) \geq 0$, and hence $\inf\{f(e)\} \geq 0$. By Theorem B.19, there is a measure $d\gamma$ on $\hat{G}$ defining the integral $\int_{\hat{G}}$, i.e.,

$$\int_{\hat{G}} d\gamma\, \zeta(\gamma) = \inf\{f(e) \mid f \in C^*_0(G),\ \hat{f} \geq \zeta\}, \tag{C.392}$$

where initially $\zeta$ is real-valued, upon which the integral is extended to $C_c(\hat{G})$ by complex linearity, as usual in (Lebesgue) integration. The point is that the measure $d\gamma$ is translation invariant and hence is a Haar measure on $\hat{G}$: indeed, replacing $\zeta$ by $L_\gamma \zeta$ amounts to replacing $f$ (as a function that satisfies $\hat{f} \geq \zeta$) by $\gamma f$. Invariance then follows from $\gamma(e) = 1$ for any character $\gamma \in \hat{G}$, which obviously implies $(\gamma f)(e) = \gamma(e) f(e) = f(e)$. The Banach spaces $L^p(G)$ and $L^p(\hat{G})$ are then defined with respect to $dx$ on $G$ (assumed given) and $d\gamma$ on $\hat{G}$ (as above), respectively.

Furthermore, the proof uses an approximate unit $(\delta_U)$ of $C^*(G)$ that lies in $C_c(G)$ and is indexed by shrinking neighbourhoods $U$ of $e \in G$. More precisely, take the directed set of all symmetric neighbourhoods of $e$ (i.e., $U^{-1} = U$), ordered by reverse inclusion $\supseteq$, take positive functions $h_U \in C_c(W)$ for some neighbourhood $W$ of $e$ satisfying $W W^{-1} \subseteq U$, normalize $h_U$ such that $\int_G h_U = 1$, and define

$$\delta_U = h_U * h_U^*; \tag{C.393}$$

$$f_U = f * \delta_U \quad (f \in C^*(G)). \tag{C.394}$$

We will show that for each $f \in C^*(G)$, we have

$$\lim_U \|f_U - f\|_{C^*} = 0. \tag{C.395}$$

To this end, we first show that $\|\delta_U\|_{C^*} \leq 1$, which follows from the estimate

$$\|\pi(\delta_U)\psi\| = \left\| \int_G dy\, \delta_U(y) L_y \psi \right\| \leq \int_G dy\, \delta_U(y) \|L_y \psi\| = \|\psi\|. \tag{C.396}$$
J G J G 
Similar estimates give $f * \delta_U \to f$ for $f \in C_c(G)$, so that finally

$$\|f * \delta_U - f\|_{C^*} = \|f * \delta_U - g * \delta_U + g * \delta_U - g + g - f\|_{C^*} \leq 2\|f - g\|_{C^*} + \|g * \delta_U - g\|_{C^*}. \tag{C.397}$$

Taking $g \in C_c(G)$, an $\varepsilon/3$ argument finishes the proof of (C.395). Moreover,

$$f_U \in C^*_0(G) \quad (f \in C^*(G)). \tag{C.398}$$

To prove this, take $g, h \in C_c(G)$. Regarding $g$ and $h$ as elements of $L^2(G)$, note that

$$g * h(x) = \langle h^*, L_{x^{-1}} g \rangle, \tag{C.399}$$

so that Cauchy-Schwarz and unitarity of $L_{x^{-1}}$ give $\|g * h\|_\infty \leq \|g\|_2 \|h\|_2$. Applying this with $g \rightsquigarrow \pi(f) g$ and $h \rightsquigarrow h$, where $f \in C^*(G)$, $g \in C_c(G)$, and $h \in C_c(G)$, yields

$$\|f * g * h\|_\infty \leq \|\pi(f) g\|_2 \|h\|_2 \leq \|f\|_{C^*} \|g\|_2 \|h\|_2; \tag{C.400}$$

$$\|f * g * h\|_2 = \|\pi(f)(g * h)\|_2 \leq \|f\|_{C^*} \|g * h\|_2. \tag{C.401}$$

Eq. (C.401) will be applied later, in the proof of (C.379). Eq. (C.400) shows that if $f_n \to f$ in $C^*(G)$ for some net $(f_n)$ in $C_c(G)$, then $f_n * g * h \to f * g * h$ uniformly, so that $f * g * h \in C_0(G)$ and $f_n * g * h \to f * g * h$ in $C_0(G)$. Also,

$$\|f * g * h\|_{C^*} = \|\hat{f} \hat{g} \hat{h}\|_\infty \leq \|\hat{f}\|_\infty \|\hat{g} \hat{h}\|_\infty = \|f\|_{C^*} \|\hat{g} \hat{h}\|_\infty, \tag{C.402}$$

by isometry of the Gelfand transform, so that also $f_n * g * h \to f * g * h$ in $C^*(G)$. If $f_n, g, h \in C_c(G)$ then $f_n * g * h \in C_c(G) \subset C^*_0(G)$, and the above computations give $f_n * g * h \to f * g * h$ in $C^*_0(G)$. This shows that $f * g * h \in C^*_0(G)$; taking $g = h_U$ and $h = h_U^*$ yields (C.398).

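We now turn to the Fourier inversion formula (C.381). Since the Gelfand transform $C^*(G) \to C_0(\hat{G})$ is an isomorphism, for any $\zeta \in C_0(\hat{G})$, we can find $f \in C^*(G)$ such that $\hat{f} = \zeta$; we can find a net $f_U = f * \delta_U$ in $C^*_0(G)$ such that

$$\lim_U \|f_U - f\|_{C^*} = \lim_U \|\hat{f}_U - \hat{f}\|_{C_0(\hat{G})} = \lim_U \|\hat{f}_U - \zeta\|_\infty = 0. \tag{C.403}$$

If $\zeta \in C_c(\hat{G})$, we in addition have $\hat{f}_U \to \zeta$ in $C^*(\hat{G})$, or, equivalently,

$$\lim_U \|\hat{f}_U - \hat{f}\|_{C^*(\hat{G})} = 0. \tag{C.404}$$

Eq. (C.403) and the fact that $\hat{\delta}_U$ is continuous, and hence uniformly continuous on every compact $K \subset \hat{G}$ (which we take such that it contains the support of $\hat{f} = \zeta$), gives $\lim_U \|\hat{\delta}_U - 1\|_\infty^{(K)} = 0$, where $\|\eta\|_\infty^{(K)}$ is the supremum of $|\eta(\gamma)|$ over all $\gamma \in K$. For $\hat{f} \in C_c(\hat{G})$, with $\hat{f}_U = \hat{\delta}_U \hat{f}$, this gives $\hat{f}_U \to \hat{f}$ in $L^1(\hat{G})$. As we trivially have $\|\hat{f}\|_{C^*(\hat{G})} \leq \|\hat{f}\|_{L^1(\hat{G})}$ (and similarly, of course, on $G$ itself), we obtain (C.404), which together with (C.403) also yields $\hat{f}_U \to \zeta$ in $C^*_0(\hat{G})$.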
Since $f_U \in C^*_0(G)$, the infimum in (C.392) is saturated, and hence the Fourier inversion formula (C.381) holds for $f_U$. Pontryagin duality then yields isometry, i.e., $\|f_U\|_0 = \|\hat{f}_U\|_0$. Convergence of $\hat{f}_U$ in $C^*_0(\hat{G})$ therefore yields convergence of $f_U$ in $C^*_0(G)$, necessarily to $f$, since we already knew that $f_U \to f$ in $C^*(G)$, cf. (C.395). This shows that $f \in C^*_0(G)$, so that (C.381) holds for $f$, implying

$$\|f\|_0 = \|\hat{f}\|_0. \tag{C.405}$$

Thus the Fourier transform $\mathcal{F} : C^*(G) \to C_0(\hat{G})$ from Theorem C.109 is given by continuous extension of $\mathcal{F}(f) = \hat{f}$ as defined by (C.356), where $f \in C_c(G)$.

To prove (C.380), let $B(G)$ be the set of all $f \in C^*_0(G)$ for which $\hat{f} \in C_c(\hat{G})$, and let $B(G)^-$ be its closure in $C^*_0(G)$. Then $\mathcal{F}$ restricts to an isometric isomorphism $B(G) \to C_c(\hat{G})$, and hence also to an isometric isomorphism $B(G)^- \to C^*_0(\hat{G})$; we recall that (by definition) $C^*_0(\hat{G})$ is the completion of $C_c(\hat{G})$ in its norm $\|\cdot\|_0$.

Repeating this construction for $\hat{G}$ instead of $G$, and using Pontryagin duality (C.389) with the ensuing isomorphisms $C^*(\hat{\hat{G}}) \cong C^*(G)$ etc., we also have a Fourier transform $\hat{\mathcal{F}} : C^*(\hat{G}) \to C_0(G)$. Since the Fourier inversion formula (C.381) holds whenever $\hat{f} \in C_c(\hat{G})$, we see that $\hat{\mathcal{F}}$ maps $C_c(\hat{G})$ isometrically to $B(G)$ and hence by continuity maps $C^*_0(\hat{G})$ to $B(G)^-$. At the same time, $\hat{\mathcal{F}}$ maps $B(\hat{G})$ (defined, mutatis mutandis, like $B(G)$) to $C_c(G)$, and hence maps $B(\hat{G})^-$ to $C^*_0(G)$. Since $B(G)^- \subset C^*_0(G)$, this implies $B(G)^- = C^*_0(G)$ and $B(\hat{G})^- = C^*_0(\hat{G})$. This proves (C.380).

Returning to (C.381), we know from the above analysis that (C.356) and (C.381) hold if $f \in C^*_0(G)$ and $\hat{f} \in C_c(\hat{G})$. If $f \in L^1(G)$, then, by Lebesgue integration theory, $\mathcal{F}(f)$ remains given by (C.356). If also $\hat{f} \in L^1(\hat{G})$, then $\hat{f} \in L^1(\hat{G}) \cap C_0(\hat{G})$ and hence $\hat{f} \in C^*_0(\hat{G})$, cf. (C.378). By (C.380), there exists $\tilde{f} \in C^*_0(G)$ such that $\tilde{f} = f$ in $C^*(G)$, and hence for a.e. $x \in G$ (with respect to Haar measure), we have

$$f(x) = \lim_U f * \delta_U(x) = \lim_U \tilde{f} * \delta_U(x) = \tilde{f}(x). \tag{C.406}$$

It follows that $f = \tilde{f}$ a.e., and so the inversion formula (C.382), and hence (C.381), holds, provided (if necessary) $f$ is replaced by its representative $\tilde{f}$.

Finally, to prove (C.379), take $f = \psi$ in (C.356) in $C_c(G)$, so that we may compute

$$\|\psi\|_2^2 = \int_G dx\, |\psi(x)|^2 = \psi * \psi^*(e) = \int_{\hat{G}} d\gamma\, \widehat{\psi * \psi^*}(\gamma) = \int_{\hat{G}} d\gamma\, |\hat{\psi}(\gamma)|^2 = \|\hat{\psi}\|_2^2. \tag{C.407}$$

We may therefore extend $\mathcal{F}$, initially given by $\mathcal{F}(f) = \hat{f}$, from $C_c(G)$ to its completion $L^2(G)$ in $\|\cdot\|_2$. Second, we prove surjectivity similarly to the previous part: pick $\zeta \in C_c(\hat{G})$, and hence $f \in C^*(G)$ with $\hat{f} = \zeta$. Then $f_U = f * \delta_U \in L^2(G)$, as follows from (C.401). Then $\hat{f}_U \to \hat{f}$ in $L^2(\hat{G})$, since analogously to the previous proof, we find that $(\hat{f}_U)$ is a Cauchy net in $L^2(\hat{G})$. By isometry of $\mathcal{F}$ (as just proved), this implies that $(f_U)$ is a Cauchy net in $L^2(G)$. Let $f_U \to g$ in $L^2(G)$; continuity of $\mathcal{F}$ gives $\mathcal{F}(g) = \zeta$, making $\mathcal{F}$ surjective at least onto $C_c(\hat{G})$. Since $L^2(\hat{G})$ is the completion of $C_c(\hat{G})$ in the $L^2$-norm $\|\cdot\|_2$, the Fourier transform $\mathcal{F} : L^2(G) \to L^2(\hat{G})$ is an isometric surjection, and hence is unitary. □
We close this section with the SNAG-Theorem (named after Stone, whose Theorem 5.73 it generalizes, Naimark, Ambrose, and Godement, each of whom published versions of it in 1944). This theorem uses projection-valued measures, which we have avoided so far, but which are appropriate here as well as in our application of the SNAG-Theorem to the Goldstone Theorem 10.28. Recall that the Riesz-Radon representation theorems B.19 and B.24 establish a bijective correspondence between states on $C_0(X)$ and probability measures on $X$. There is a similar correspondence between representations of $C_0(X)$ and projection-valued measures on $X$. Cf. §B.4.

Definition C.112. Let $X$ be a set with $\sigma$-algebra $\Sigma \subset \mathcal{P}(X)$ (the power set of $X$), and $H$ a Hilbert space. A projection-valued measure for $(X, \Sigma, H)$ is a map $e : \Sigma \to \mathcal{P}(H)$, the set of projections on $H$, such that for each unit vector $\psi \in H$, the map $e^{(\psi)} : \Sigma \to [0,1]$ defined by

$$e^{(\psi)}(A) = \langle \psi, e(A)\psi \rangle, \tag{C.408}$$

is a probability measure. Equivalently, $e(\emptyset) = 0_H$, $e(X) = 1_H$, $e(A \cap B) = e(A)e(B)$, and $e(\bigcup_n A_n) = \sum_n e(A_n)$ for pairwise disjoint $A_n$ in the strong topology on $B(H)$.

The simplest example is $H = L^2(X, \Sigma, \mu)$ with $e(A) = 1_A$, cf. §B.6.

As in (B.328), one can integrate any bounded measurable function $f : X \to \mathbb{C}$ "against" $e$, i.e., there is a unique operator $\int_X de\, f$ such that for any $\varepsilon > 0$ there is a finite partition $X = \bigsqcup_{i=1}^n A_i$ of $X$ into $n$ Borel sets $A_i$, such that for any $x_i \in A_i$,

$$\left\| \int_X de\, f - \sum_{i=1}^n f(x_i) e(A_i) \right\| < \varepsilon. \tag{C.409}$$

Analogously to the Riesz-Radon representation theorem, one may then prove: 

Theorem C.113. Let $X$ be a locally compact Hausdorff space. There is a bijective correspondence between non-degenerate representations $\pi : C_0(X) \to B(H)$ and projection-valued measures $e$ for $(X, \Sigma, H)$ (where $\Sigma$ is the Borel $\sigma$-algebra), viz.

$$\pi(f) = \int_X de\, f; \tag{C.410}$$

$$e(A) = \pi(1_A), \tag{C.411}$$

where $\pi(1_A)$ is defined by extending $\pi$ from $C_0(X)$ to the C*-algebra $\mathcal{B}(X)$ of bounded Borel functions on $X$ (cf. Theorem B.102 and Proposition B.98).

We finally need the existence of a bijective correspondence between continuous unitary representations $u$ of $G$ and non-degenerate representations of $C^*(G)$ given by (C.506) in §C.18 below; see the comment below Definition C.119. Combined with Theorems C.109 and C.113, we then obtain the SNAG-Theorem:

Theorem C.114. There is a bijective correspondence between continuous unitary representations $u$ of a locally compact abelian group $G$ on some Hilbert space $H$ and projection-valued measures $e : \mathcal{B}(\hat{G}) \to \mathcal{P}(H)$ on the dual group $\hat{G}$, such that

$$u(x) = \int_{\hat{G}} de(\gamma)\, \gamma(x). \tag{C.412}$$
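As a finite-dimensional illustration of (C.412), the following sketch takes $G = \mathbb{Z}_N$ acting on $\mathbb{C}^N$ by cyclic shifts. The dual group is finite, so the projection-valued measure reduces to a family of rank-one projections and the integral to a sum. The helper names `u`, `gamma` and `e` are hypothetical, and the labelling of characters is a convention chosen here, not taken from the text.

```python
import numpy as np

N = 6
j = np.arange(N)
S = np.roll(np.eye(N), 1, axis=0)        # shift operator: (S psi)[i] = psi[i-1]

def u(x):
    """Unitary representation of Z_N on C^N by cyclic shifts."""
    return np.linalg.matrix_power(S, int(x) % N)

def gamma(k, x):
    """Character gamma_k of Z_N evaluated at x."""
    return np.exp(2j * np.pi * k * x / N)

def e(k):
    """Spectral projection e({gamma_k}): projects onto the joint eigenvector w_k."""
    w = np.exp(-2j * np.pi * k * j / N) / np.sqrt(N)   # S w = gamma_k(1) w
    return np.outer(w, w.conj())

# e is a projection-valued measure on the (finite) dual group
resolution_of_identity = sum(e(k) for k in range(N))
# u(x) is recovered by "integrating" the characters against e
reconstructed = [sum(gamma(k, x) * e(k) for k in range(N)) for x in range(N)]
```

Here `reconstructed[x]` agrees with `u(x)` up to rounding, which is the content of (C.412) in this special case.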
C.16 Intermezzo: Lie groupoids

Groupoids generalize groups, group actions, and equivalence relations. As such, 
they provide a more flexible language for dealing with symmetries than either of 
these. Like Lie groups, one also has Lie groupoids, which form an important tool in 
constructing continuous bundles of C*-algebras (see §C.19 below). These, in turn, 
provide the mathematical foundation of (deformation) quantization, see Chapter 7. 

Definition C.115. A groupoid $G = (G_1, G_0, s, t, \iota, I)$ is a small category (i.e. a category in which the underlying classes are sets, cf. §E.1) in which each arrow is invertible. Thus one has a set (of arrows) $G_1$ doubly fibered over some base space $G_0$ through source and target maps $s, t : G_1 \to G_0$. These maps define the set

$$G_2 = \{(x,y) \in G_1 \times G_1 \mid s(x) = t(y)\} \tag{C.413}$$

of composable pairs, on which a multiplication $m : G_2 \to G_1$ is defined, which we simply denote by $xy = m(x,y)$, subject to the axioms

$$s(xy) = s(y); \quad t(xy) = t(x) \quad ((x,y) \in G_2); \tag{C.414}$$

$$(xy)z = x(yz) \quad ((x,y) \in G_2,\ (y,z) \in G_2), \tag{C.415}$$

the third being well defined by virtue of the first and the second.

Furthermore, there is an object inclusion map $\iota : G_0 \to G_1$, $u \mapsto \mathrm{id}_u$, satisfying

$$s(\mathrm{id}_u) = t(\mathrm{id}_u) = u \quad (u \in G_0); \tag{C.416}$$

$$x\, \mathrm{id}_{s(x)} = \mathrm{id}_{t(x)}\, x = x \quad (x \in G_1). \tag{C.417}$$

Finally, what makes a (small) category a groupoid is the existence of an inverse

$$I : G_1 \to G_1, \quad x \mapsto x^{-1},$$

satisfying

$$s(x^{-1}) = t(x); \quad t(x^{-1}) = s(x) \quad (x \in G_1); \tag{C.418}$$

$$x^{-1} x = \mathrm{id}_{s(x)}; \quad x x^{-1} = \mathrm{id}_{t(x)} \quad (x \in G_1). \tag{C.419}$$

A Lie groupoid is a groupoid for which $G_1$ and $G_0$ are manifolds, $s$ and $t$ are surjective submersions, and multiplication and inversion are smooth.

We often identify $u$ with $\mathrm{id}_u$, so that $x^{-1} x = s(x)$, etc. We allow manifolds with boundary, which provide key examples; cf. Proposition C.117 below.

Proposition C.116. In a Lie groupoid, object inclusion is an immersion, inversion is a diffeomorphism, $G_2$ is a closed submanifold of $G_1 \times G_1$, and for each $u \in G_0$, the fibers $s^{-1}(u)$ and $t^{-1}(u)$ are submanifolds of $G_1$.

Abusing notation, Gi is often called G. Some basic examples of Lie groupoids are: 
• Lie groups $G$, where $G_1 = G$ and $G_0 = \{e\}$, $e \in G$ being the unit.

• Manifolds $M$, where $G_1 = G_0 = M$ with the obvious trivial groupoid structure $s(x) = t(x) = \mathrm{id}_x = x^{-1} = x$, and $xx = x$.

• Pair groupoids over a manifold $G_0 = M$, where $G_1 = M \times M$ and $s(x,y) = y$, $t(x,y) = x$, $(x,y)^{-1} = (y,x)$, $(x,y)(y,z) = (x,z)$, and $\mathrm{id}_x = (x,x)$.

• Smooth equivalence relations, i.e. immersed submanifolds of $M \times M$.

• Action groupoids $\Gamma \ltimes M$, which are defined by a smooth (left) action $\Gamma \circlearrowright M$ of a Lie group $\Gamma$ on a manifold $M$, where $G_1 = \Gamma \times M$, $G_0 = M$, $s(g,m) = g^{-1}m$, $t(g,m) = m$, and $(g,m)(h, g^{-1}m) = (gh, m)$.

• Vector bundles $\pi : E \to M$ over a manifold $G_0 = M$, with $s = t$ given by the bundle projection $\pi$, object inclusion $M \to E$ as the zero section, multiplication as fiberwise addition of vectors, and inverse $\xi^{-1} = -\xi$.
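The groupoid axioms of Definition C.115 can be checked mechanically in a finite toy model. The following minimal sketch uses the pair groupoid over a four-point set in place of a manifold (so smoothness plays no role); the function names `s`, `t`, `mul`, `inv`, `unit` and `check_axioms` are illustrative choices.

```python
from itertools import product

M = range(4)                       # finite stand-in for the base space G_0
arrows = list(product(M, M))       # G_1 = M x M

def s(a):                          # source: s(x, y) = y
    return a[1]

def t(a):                          # target: t(x, y) = x
    return a[0]

def mul(a, b):                     # (x, y)(y, z) = (x, z), defined iff s(a) == t(b)
    if s(a) != t(b):
        raise ValueError("pair is not composable")
    return (a[0], b[1])

def inv(a):                        # (x, y)^{-1} = (y, x)
    return (a[1], a[0])

def unit(u):                       # id_u = (u, u)
    return (u, u)

def check_axioms():
    """Check (C.414)-(C.419) on every arrow and every composable triple."""
    for a in arrows:
        assert mul(a, unit(s(a))) == a and mul(unit(t(a)), a) == a
        assert s(inv(a)) == t(a) and t(inv(a)) == s(a)
        assert mul(inv(a), a) == unit(s(a)) and mul(a, inv(a)) == unit(t(a))
    for a, b, c in product(arrows, repeat=3):
        if s(a) == t(b) and s(b) == t(c):
            assert s(mul(a, b)) == s(b) and t(mul(a, b)) == t(a)
            assert mul(mul(a, b), c) == mul(a, mul(b, c))
    return True
```

Swapping in the source, target and multiplication of another example from the list (for instance a finite group acting on a finite set) exercises the same checks.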

Any Lie groupoid $G$ defines an associated tangent groupoid $G^T$, which will play a crucial role in §C.19. We first explain the (surprising) underlying differential geometry in three steps of increasing complexity. We start with the manifold $M = \mathbb{R}^n$, with tangent bundle $TM = \mathbb{R}^{2n}$. Our goal is to describe a smooth structure on

$$F = TM \sqcup (0,1] \times M \times M, \tag{C.420}$$

seen as a bundle over $[0,1]$, where (as the notation already indicates) the fibers are

$$F_0 = TM; \tag{C.421}$$

$$F_\hbar = M \times M \quad (\hbar > 0). \tag{C.422}$$

Although each fiber $F_\hbar$ of this bundle is isomorphic to $\mathbb{R}^{2n}$, the smooth structure of $F$ is not equal or even diffeomorphic to the usual one on $[0,1] \times \mathbb{R}^{2n}$. Instead, we define

$$\varphi : [0,1] \times TM \to TM \sqcup (0,1] \times M \times M; \tag{C.423}$$

$$\varphi(0, \xi) = \xi; \tag{C.424}$$

$$\varphi(\hbar, \xi) = (\hbar, \exp^W(\hbar\xi)) \quad (\hbar > 0), \tag{C.425}$$

where the symmetrized ("Weyl") exponential map $\exp^W : TM \to M \times M$ is given by

$$\exp^W(x,v) = (x - \tfrac{1}{2}v,\ x + \tfrac{1}{2}v). \tag{C.426}$$

Here the coordinates $(x,v)$ of $\xi \in T_xM$ denote $\xi f(x) = \sum_i v^i\, \partial f(x)/\partial x^i = \sum_i v^i \partial_i f(x)$.

Like its more familiar counterpart $(x,v) \mapsto (x, x+v)$, $\exp^W$ is a diffeomorphism.

For $M = \mathbb{R}^n$, our map $\varphi$ is a bijection, with inverse given by

$$\varphi^{-1}(x,v) = (0,x,v); \tag{C.427}$$

$$\varphi^{-1}(\hbar,x,y) = \left(\hbar,\ \tfrac{1}{2}(x+y),\ \frac{y-x}{\hbar}\right) \quad (\hbar > 0). \tag{C.428}$$

We use this to transfer the product topology (and also the smooth structure as a manifold with boundary) from $[0,1] \times TM$ to $F$. Then a sequence $(\hbar_n, x_n, y_n)$ in $F$, where $\hbar_n \to 0$, converges iff $x_n \to x$, $y_n \to x$ for some $x \in M$, and $(y_n - x_n)/\hbar_n \to v$, in which case $(\hbar_n, x_n, y_n) \to (0,x,v)$. More abstractly, $F$ has two key properties:

1. The map $F \to [0,1] \times M \times M$ given by

$$(x,v) \mapsto (0,x,x); \tag{C.429}$$

$$(\hbar,x,y) \mapsto (\hbar,x,y) \quad (\hbar > 0), \tag{C.430}$$

is smooth. Indeed, as a map $[0,1] \times TM \to [0,1] \times M \times M$, this map is given by

$$(0,x,v) \mapsto (0,x,x); \tag{C.431}$$

$$(\hbar,x,v) \mapsto (\hbar,\ x - \tfrac{1}{2}\hbar v,\ x + \tfrac{1}{2}\hbar v). \tag{C.432}$$

2. For any $f \in C^\infty(M \times M)$ that vanishes on the diagonal

$$\Delta(M) = \{(x,x) \mid x \in M\} \subset M \times M, \tag{C.433}$$

the function $\delta f$ on $F$ defined by

$$\delta f(x,v) = \xi_\pm f(x,x); \tag{C.434}$$

$$\delta f(\hbar,x,y) = f(x,y)/\hbar \quad (\hbar > 0), \tag{C.435}$$

where the tangent vector $\xi_\pm \in T_{(x,x)}(M \times M)$ has components $(-\tfrac{1}{2}v, \tfrac{1}{2}v)$, is smooth. Indeed, as a function on $[0,1] \times TM$, the pullback $\delta^* f = \delta f \circ \varphi$ is given by

$$\delta^* f(0,x,v) = \xi_\pm f(x,x); \tag{C.436}$$

$$\delta^* f(\hbar,x,v) = f(x - \tfrac{1}{2}\hbar v,\ x + \tfrac{1}{2}\hbar v)/\hbar, \tag{C.437}$$

which is smooth given our assumptions on $f$.
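The two properties can be probed numerically for $M = \mathbb{R}$. The sketch below implements the chart $\varphi$ and its inverse, and evaluates the pullback of $\delta f$ for one particular test function vanishing on the diagonal, $f(x,y) = \sin y - \sin x$; this choice of $f$ and all function names are assumptions made for illustration.

```python
import numpy as np

def phi(h, x, v):
    """Chart [0,1] x TM -> F for M = R; at h = 0 the point (x, v) of TM is returned."""
    if h == 0:
        return (0.0, x, v)
    return (h, x - 0.5 * h * v, x + 0.5 * h * v)

def phi_inv(h, x, y):
    """Inverse chart on the part of F with h > 0."""
    return (h, 0.5 * (x + y), (y - x) / h)

def f(x, y):
    """Test function on M x M vanishing on the diagonal (illustrative choice)."""
    return np.sin(y) - np.sin(x)

def delta_f_pullback(h, x, v):
    """delta f composed with phi, as a function on [0,1] x TM."""
    if h == 0:
        # xi_pm f(x, x) with xi_pm = (-v/2, v/2): here d1 f = -cos(x), d2 f = cos(x)
        return (-0.5 * v) * (-np.cos(x)) + (0.5 * v) * np.cos(x)
    return f(x - 0.5 * h * v, x + 0.5 * h * v) / h

# Continuity at h = 0: the difference quotient approaches the derivative term
gap = abs(delta_f_pullback(1e-3, 0.3, 1.7) - delta_f_pullback(0.0, 0.3, 1.7))
```

The quantity `gap` shrinks quadratically in `h` for this `f`, consistent with the pullback extending smoothly to the boundary $\hbar = 0$.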

A similar construction works for any (smooth) manifold $M$, except that the smooth structure on $F$ may no longer be definable in terms of a single map $\varphi$. Instead, we invoke a special case of the well-known tubular neighbourhood theorem of Riemannian (or, more generally, affine) geometry, which states that $M$, identified with the zero section in its tangent bundle $TM$, has an open neighbourhood $U$ such that the (symmetrized) exponential map $\exp^W : U \to M \times M$ is a diffeomorphism onto its image. Here $\exp^W(\xi) = (\gamma(-\tfrac{1}{2}), \gamma(\tfrac{1}{2}))$, where $\xi \in T_xM$ and $\gamma$ is the unique affinely parametrized geodesic with $\gamma(0) = x$ and $\dot{\gamma}(0) = \xi$. We now replace the space $[0,1] \times TM$ used in the special case $M = \mathbb{R}^n$ by the pair of spaces

$$V_1 = \{(\hbar,\xi) \in [0,1] \times TM \mid \hbar\xi \in U\}; \tag{C.438}$$

$$V_2 = (0,1] \times M \times M, \tag{C.439}$$

with associated maps $\varphi_1 : V_1 \to F$ and $\varphi_2 : V_2 \to F$ defined by

$$\varphi_1(0,\xi) = \xi; \tag{C.440}$$

$$\varphi_1(\hbar,\xi) = (\hbar, \exp^W(\hbar\xi)) \quad (\hbar > 0); \tag{C.441}$$

$$\varphi_2(\hbar,x,y) = (\hbar,x,y) \quad (\hbar > 0). \tag{C.442}$$

Then $\varphi_1$ and $\varphi_2$ are injective and, writing $F_i = \varphi_i(V_i)$, we have $F = F_1 \cup F_2$, which is far from a disjoint union; let $F_{ij} = F_i \cap F_j$. Also, let $V_{ij} = \{a \in V_i \mid \varphi_i(a) \in F_{ij}\}$, with associated maps $\varphi_{ij} = \varphi_j^{-1} \circ \varphi_i : V_{ij} \to V_{ji}$. We now define the smooth structure of $F$ by declaring $f : F \to \mathbb{R}$ to be smooth iff $f_i \circ \varphi_i : V_i \to \mathbb{R}$ is smooth, $i = 1,2$, where $f_i$ is the restriction of $f$ to $F_i$. These conditions are compatible on the overlap $F_{ij}$, since $\varphi_{12}(\hbar,\xi) = (\hbar, \exp^W(\hbar\xi))$ is a diffeomorphism (with inverse $\varphi_{21}$). This smooth structure may also be defined by imposing conditions 1 and 2 above, mutatis mutandis. In particular, (C.434) should now read

$$\delta f(\xi) = \xi_\pm f(x,x); \quad \xi_\pm = (-\tfrac{1}{2}\xi, \tfrac{1}{2}\xi). \tag{C.443}$$

A more general form of the above construction, which will be used to generate a vast class of continuous bundles of C*-algebras, is as follows. Let $M$ be a closed submanifold of another manifold $G$ (in the above situation we take $G = M \times M$ and identify $M$ with $\Delta(M)$), and replace $TM$ above by the normal bundle

$$N_M G = T_M G / T_M M, \tag{C.444}$$

i.e., the quotient of the restriction $T_M G$ of the tangent bundle $TG$ to $M \subset G$ by its subbundle $T_M M \cong TM$; hence the fiber of $N_M G$ at $x \in M \subset G$ is $T_xG / T_xM$.

In the above case $G = M \times M$, one therefore has

$$N_M(M \times M) \cong TM, \tag{C.445}$$

through the isomorphism $[(\xi_1, \xi_2)] \mapsto \xi_2 - \xi_1$, where $(\xi_1, \xi_2) \in T_{(x,x)}(M \times M)$ and $[(\xi_1, \xi_2)]$ is its equivalence class in the quotient $T_{(x,x)}(M \times M) / T_{(x,x)}\Delta(M)$. Other easy examples are Lie groups $G$ (with $M = \{e\}$), for which $N_M G = T_e G = \mathfrak{g}$ is just the Lie algebra of $G$ (at least as a vector space), and $G = M$, for which $N_M G = M$.

For the bundle $F$, defined over $I = [0,1]$, we take the fibers and total space as

$$F_0 = N_M G; \tag{C.446}$$

$$F_\hbar = G \quad (\hbar > 0); \tag{C.447}$$

$$F = N_M G \sqcup (0,1] \times G. \tag{C.448}$$

Once again, there are two equivalent ways to define a smooth structure on $F$. The first uses a more general version of the tubular neighbourhood theorem from differential geometry, which states that $M \subset N_M G$ (seen as its zero section) has an open neighbourhood $U$ that is diffeomorphic to some open neighbourhood $U'$ of $M \subset G$ via a diffeomorphism $\varphi$ that maps $M$ to itself (i.e., pointwise). Then put

$$V_1 = \{(\hbar,\xi) \in [0,1] \times N_M G \mid \hbar\xi \in U\}; \tag{C.449}$$

$$V_2 = (0,1] \times G, \tag{C.450}$$

again with associated maps $\varphi_1 : V_1 \to F$ and $\varphi_2 : V_2 \to F$, this time defined by

$$\varphi_1(0,\xi) = \xi; \tag{C.451}$$

$$\varphi_1(\hbar,\xi) = (\hbar, \varphi(\hbar\xi)) \quad (\hbar > 0); \tag{C.452}$$

$$\varphi_2(\hbar,n) = (\hbar,n) \quad (\hbar > 0). \tag{C.453}$$

One then proceeds exactly as above. Equivalently, we impose that:

1. The map $F \to [0,1] \times G$, defined at $\hbar = 0$ by $\xi \mapsto (0,x)$, where $\xi$ lies in the fiber of $N_M G$ over $x \in M \subset G$, and by $(\hbar,n) \mapsto (\hbar,n)$ for $\hbar > 0$ and $n \in G$, is smooth.

2. For each $f \in C^\infty(G)$ that vanishes on $M$, the function $\delta f$ on $F$ defined by

$$\delta f(\xi) = \xi f; \tag{C.454}$$

$$\delta f(\hbar,n) = f(n)/\hbar \quad (\hbar > 0), \tag{C.455}$$

is smooth (note that $\xi f$ is well defined despite the fact that $\xi \in T_M G / T_M M$ rather than $\xi \in T_M G$, since any two representatives of $\xi$ in $T_M G$ differ by vectors in $T_M M$, which vanish on $f$ because $f|_M = 0$ by assumption).

After this preparation, we are at last in a position to define tangent groupoids. 

Proposition C.117. Any Lie groupoid $G$ over some base space $G_0 = M$ defines an associated tangent groupoid $G^T$, with total space $G^T = F$, cf. (C.448), with smooth structure as explained, base space $G^T_0 = [0,1] \times M$, source and target projections

$$s^T(\xi) = t^T(\xi) = (0, \pi(\xi)); \tag{C.456}$$

$$s^T(\hbar,x) = (\hbar, s(x)) \quad (\hbar > 0); \tag{C.457}$$

$$t^T(\hbar,x) = (\hbar, t(x)) \quad (\hbar > 0), \tag{C.458}$$

where $\pi : T_M G / T_M M \to M$ is the bundle projection, and $x \in G$, multiplication

$$\xi \cdot \eta = \xi + \eta \quad (\hbar = 0); \tag{C.459}$$

$$(\hbar,x) \cdot (\hbar,y) = (\hbar,xy) \quad (\hbar > 0), \tag{C.460}$$

and inverse

$$\xi^{-1} = -\xi \quad (\hbar = 0); \tag{C.461}$$

$$(\hbar,x)^{-1} = (\hbar,x^{-1}) \quad (\hbar > 0). \tag{C.462}$$

In other words, $G^T$, seen as a bundle over $[0,1]$, is a "bundle of groupoids": the groupoid above $\hbar = 0$ is the normal bundle $\pi : N_M G \to M$, as in the vector bundle example above, whereas the fibers above $\hbar > 0$ are $G$ itself.




C.17 C*-algebras associated to Lie groupoids 

One may associate two C*-algebras to a Lie groupoid $G$, called $C^*(G)$ and $C^*_r(G)$, which coincide for abelian Lie groups, and as such generalize the construction in §C.15, cf. (C.332) and (C.336) - (C.337). We first generalize the Haar measure.

Definition C.118. A Haar system on a Lie groupoid $G$ is a family of measures $(\mu^u, u \in G_0)$, where $\mu^u$ is defined on the $t$-fiber

$$G^u = t^{-1}(u), \tag{C.463}$$

where it is locally equivalent to Lebesgue measure, and each function

$$u \mapsto \int_{G^u} d\mu^u\, f \quad (f \in C_c^\infty(G)) \tag{C.464}$$

on $G_0$ is smooth. A Haar system is left-invariant if for each $f \in C_c^\infty(G)$ and $x \in G$,

$$\int_{G^{t(x)}} d\mu^{t(x)}(y)\, f(y) = \int_{G^{s(x)}} d\mu^{s(x)}(y)\, f(xy). \tag{C.465}$$

It is sometimes convenient to regard $\mu^u$ as a measure on all of $G$ but having support in $G^u$. Either way, any Lie groupoid possesses a left-invariant Haar system, briefly called a left Haar system. For example, if $G$ is a Lie group, $u \in G_0$ can only be the identity $e \in G$, so that a left-invariant Haar system is the same as a left-invariant Haar measure on $G$ (which exists on any locally compact group). Furthermore:

1. If $G = G_0 = M$, where $M$ is a manifold (as always), then $s(x) = t(x) = x = x^{-1}$ and the condition (C.465) is empty, so that a left-invariant Haar system is just a smooth function $\mu : M \to (0,\infty)$, i.e. $\mu(u) = \mu^u$. In what follows, we simply take $\mu(u) = 1$ for each $u \in M$. More generally, whenever $G^u$ is compact, we normalize a Haar system by imposing $\mu^u(G^u) = 1$, as in the case of groups.

2. For a pair groupoid $G = M \times M$, on the other hand, (C.465) forces the system of measures to collapse to a single measure $\mu$ on $M$, i.e., $\mu^u = \mu$ for each $u \in M$. For $M = \mathbb{R}^n$, we take $\mu$ to be Lebesgue measure.

3. For the tangent bundle $G = TM$ (with fiberwise addition), which is essentially a bundle of abelian groups $\mathbb{R}^n$, eq. (C.465) forces each measure $\mu^u$ on

$$t^{-1}(u) = T_uM, \tag{C.466}$$

to be translation invariant. For $M = \mathbb{R}^n$ (or, more generally, if $TM$ is a trivial bundle), we take all $\mu^u$ to be the same and all equal to Lebesgue measure.

4. For action groupoids $\Gamma \ltimes M$, we have $t^{-1}(u) \cong \Gamma$, and any left-invariant Haar measure $d\gamma$ on $\Gamma$ yields a left Haar system on $G$ as $d\mu^u = d\gamma$, for each $u \in M$.

5. In case of a tangent groupoid $G^T$, the $t$-fibers are indexed by $(\hbar,u)$, where $\hbar \in [0,1]$ and $u \in M$, so that a left Haar system consists of a family $\mu^{(\hbar,u)}$. It turns out that given any left Haar system $(\mu^u, u \in M)$ on $G$, there exists a (suitably normalized) left Haar system $(\mu_0^u, u \in M)$ on the vector bundle $N_M G$ such that

$$\mu^{(0,u)} = \mu_0^u; \tag{C.467}$$

$$\mu^{(\hbar,u)} = \hbar^{-n} \mu^u \quad (\hbar > 0), \tag{C.468}$$

where $n = \dim(G) - \dim(M)$, defines a Haar system on $G^T$; the extra factor $\hbar^{-n}$ in (C.468) is necessary and sufficient for this Haar system to satisfy the smoothness condition on (C.464). For example, if $G = \mathbb{R}^n \times \mathbb{R}^n$ is the pair groupoid on $\mathbb{R}^n$, where each fiber $G^u \cong \mathbb{R}^n$ is endowed with Lebesgue measure $d^nx$, then the fibers $\mathbb{R}^n$ of the vector bundle $N_M G = T\mathbb{R}^n$ should carry exactly the same measure. To see this, in (C.464) we substitute $G \rightsquigarrow G^T$ and $u \rightsquigarrow (\hbar,y)$ ($y \in \mathbb{R}^n$), so that for each $f \in C_c^\infty(G^T)$ the following function on $[0,1] \times \mathbb{R}^n$ should be smooth:

$$(0,y) \mapsto \int_{\mathbb{R}^n} d^nv\, f(0,y,v); \tag{C.469}$$

$$(\hbar,y) \mapsto \hbar^{-n} \int_{\mathbb{R}^n} d^nx\, f(\hbar,x,y) \quad (\hbar > 0). \tag{C.470}$$

To interpret this condition, we put $f = \tilde{f} \circ \varphi^{-1}$, where $\tilde{f}$ is smooth on $[0,1] \times T\mathbb{R}^n$, and $\varphi^{-1}$ is given by (C.427) - (C.428). This transforms the above function into

$$(0,y) \mapsto \int_{\mathbb{R}^n} d^nv\, \tilde{f}(0,y,v); \tag{C.471}$$

$$(\hbar,y) \mapsto \int_{\mathbb{R}^n} d^nv\, \tilde{f}(\hbar,\ y - \tfrac{1}{2}\hbar v,\ v) \quad (\hbar > 0). \tag{C.472}$$
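The role of the factor $\hbar^{-n}$ can be seen numerically in the case $n = 1$. The sketch below assumes a Gaussian test function $\tilde{f}(\hbar,x,v) = e^{-x^2 - v^2}$ on $[0,1] \times T\mathbb{R}$ (rapidly decaying rather than compactly supported, which suffices for illustration), pulls it back through the inverse chart (C.427) - (C.428), and compares the rescaled fiber integral at small $\hbar$ with the integral over the fiber at $\hbar = 0$. All names are hypothetical.

```python
import numpy as np

n = 1  # fiber dimension, dim(G) - dim(M), for the pair groupoid on R

def f_tilde(h, x, v):
    """Smooth test function on [0,1] x TR (illustrative Gaussian, independent of h)."""
    return np.exp(-x ** 2 - v ** 2)

def f(h, x, y):
    """The same function in the coordinates (h, x, y) on (0,1] x R x R."""
    return f_tilde(h, 0.5 * (x + y), (y - x) / h)

def fiber_integral(h, y, num=20001):
    """Integral of f over the fiber at (h, y), with the h**(-n) rescaling for h > 0."""
    if h == 0:
        v = np.linspace(-12.0, 12.0, num)
        return np.sum(f_tilde(0.0, y, v)) * (v[1] - v[0])
    x = np.linspace(y - 12.0 * h, y + 12.0 * h, num)   # integrand is concentrated near x = y
    return np.sum(f(h, x, y)) * (x[1] - x[0]) / h ** n

at_zero = fiber_integral(0.0, 0.5)
near_zero = fiber_integral(0.01, 0.5)
unscaled = near_zero * 0.01 ** n        # what one would get without the factor h**(-n)
```

With the factor included, `near_zero` is close to `at_zero`; without it, the value `unscaled` tends to zero as $\hbar \to 0$, so the function in (C.464) would fail to be continuous at $\hbar = 0$.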

We now define C*-algebras $C^*(G)$ and $C^*_r(G)$, which depend on the choice of a left Haar system on $G$, but different choices lead to isomorphic C*-algebras. We start from $C_c^\infty(G)$, on which we define a convolution product and an involution by

$$f * g(x) = \int_{G^{s(x)}} d\mu^{s(x)}(y)\, f(xy) g(y^{-1}); \tag{C.473}$$

$$f^*(x) = \overline{f(x^{-1})}. \tag{C.474}$$

We then define a C*-algebra $C^*(G)$ as the completion of $C_c^\infty(G)$ in the norm

$$\|f\| = \sup\{\|\pi(f)\|\}, \tag{C.475}$$

where the supremum is over all Hilbert space representations of $C_c^\infty(G)$ that satisfy

$$\|\pi(f)\| \leq \|f\|_I \equiv \max\{\|f\|_I^s, \|f\|_I^t\}, \tag{C.476}$$

where the canonical $L^1$-norms on the right-hand side are defined by

$$\|f\|_I^s = \sup_{u \in M} \int_{G_u} d\mu_u(y)\, |f(y)|; \quad \|f\|_I^t = \sup_{u \in M} \int_{G^u} d\mu^u(y)\, |f(y)|. \tag{C.477}$$

A more tractable possibility is to limit these representations to a selected class, such as the following one. Further to the $t$-fiber (C.463), we denote the $s$-fibers of $G$ by

$$G_u = s^{-1}(u), \tag{C.478}$$

which carries a canonical measure

$$d\mu_u(x) = d\mu^u(x^{-1}). \tag{C.479}$$

This leads to Hilbert spaces

$$H_u = L^2(G_u, \mu_u), \tag{C.480}$$

on which $C_c^\infty(G)$ can be represented through the formula

$$\pi_u(f)\psi(x) = \int_{G^u} d\mu^u(y)\, f(xy) \psi(y^{-1}) \quad (\psi \in H_u,\ x \in G_u,\ y \in G^u). \tag{C.481}$$

Such representations automatically satisfy the bound (C.476); restricting the representations $\pi$ in (C.475) to these $\pi_u$, $u \in M$, gives the reduced groupoid C*-algebra $C^*_r(G)$. In other words, $C^*_r(G)$ is the completion of $C_c^\infty(G)$ in the norm

$$\|f\|_r = \sup\{\|\pi_u(f)\|,\ u \in M\}. \tag{C.482}$$


One often has $C^*_r(G) = C^*(G)$, but if $G$ is for example a non-compact and semi-simple Lie group, then the two differ (in which case $C^*_r(G)$ is a quotient of $C^*(G)$). Deferring groups to the next section, the other examples on our list are as follows.

1. For a space $G = M$, the algebraic operations are

$$f * g(x) = f(x)g(x); \tag{C.483}$$

$$f^*(x) = \overline{f(x)}, \tag{C.484}$$

from which we obtain

$$C^*_r(M) = C_0(M). \tag{C.485}$$

Indeed, $G_x = \{x\}$, so with $\mu(x) = 1$ for each $x \in M$, we obtain

$$H_x = \mathbb{C}; \tag{C.486}$$

$$\pi_x(f) = f(x), \tag{C.487}$$

and hence $\|f\|_r = \|f\|_\infty$; the completion of $C_c^\infty(M)$ in this norm is $C_0(M)$.

2. A pair groupoid $G = M \times M$, with left Haar system $\mu^u = \mu$ for all $u \in M$, gives

$$f * g(u,v) = \int_M d\mu(w)\, f(u,w) g(w,v); \tag{C.488}$$

$$f^*(u,v) = \overline{f(v,u)}, \tag{C.489}$$

which of course is reminiscent of the corresponding operations on matrices. Also,

$$H_u = L^2(M,\mu); \tag{C.490}$$

$$\pi_u(f)\psi(v) = \int_M d\mu(w)\, f(v,w) \psi(w), \tag{C.491}$$

where we wrote $x = (v,u)$ and $y = (u,w)$, and identified $\psi(v)$ with $\psi(v,u)$. With this identification, the representations $\pi_u$ are the same for each $u$. Using the fact that $C_c^\infty(M \times M)$ is dense in $L^2(M \times M)$ and that integral operators (C.491) of Hilbert-Schmidt type are dense in the compact operators, we obtain

$$C^*_r(M \times M) \cong B_0(L^2(M)). \tag{C.492}$$

3. For a tangent bundle $G = TM$, we have, identifying $T_uM$ with $\mathbb{R}^n$, $n = \dim(M)$,

$$f * g(u,v) = \int_{\mathbb{R}^n} d^nw\, f(u,v+w) g(u,-w); \tag{C.493}$$

$$f^*(u,v) = \overline{f(u,-v)}, \tag{C.494}$$

where we used local coordinates $(u,v)$ on $TM$. Furthermore, we have

$$H_u = L^2(T_uM) \cong L^2(\mathbb{R}^n); \tag{C.495}$$

$$\pi_u(f)\psi(v) = \int_{\mathbb{R}^n} d^nw\, f(u,v+w) \psi(-w), \tag{C.496}$$

which is diagonalized by a Fourier transform $f \mapsto \hat{f}$ (cf. Theorem C.109), with

$$\hat{f}(u,p) = \int_{\mathbb{R}^n} d^nv\, f(u,v)\, e^{-i\langle p, v\rangle}. \tag{C.497}$$

This map therefore gives an isomorphism

$$C^*_r(TM) \cong C_0(T^*M). \tag{C.498}$$

4. The (reduced) C*-algebra of an action groupoid $G = \Gamma \ltimes M$ has operations

$$f * g(\gamma,u) = \int_\Gamma d\delta\, f(\gamma\delta,u)\, g(\delta^{-1}, \delta^{-1}\gamma^{-1}u); \tag{C.499}$$

$$f^*(\gamma,u) = \overline{f(\gamma^{-1}, \gamma^{-1}u)}, \tag{C.500}$$

and the special representations $\pi_u$ are given by

$$H_u = L^2(\Gamma); \tag{C.501}$$

$$\pi_u(f)\psi(\gamma) = \int_\Gamma d\delta\, f(\gamma\delta, \gamma u) \psi(\delta^{-1}). \tag{C.502}$$

This gives the (reduced) transformation group C*-algebra (see the end of §C.18)

$$C^*_r(\Gamma \ltimes M) = C^*_r(\Gamma, M). \tag{C.503}$$

5. The C*-algebra $C^*(G^T)$ of a tangent groupoid will be analyzed in §C.19.
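The remark that the pair groupoid operations are reminiscent of matrix operations becomes literal when $M$ is a finite set with counting measure. The following sketch evaluates the general groupoid convolution and involution in that case and compares them with the matrix product and the conjugate transpose; the set size and the helper names `convolve`, `star` and `reduced_norm` are illustrative assumptions.

```python
import numpy as np

n = 5                                    # M = {0, ..., n-1} with counting measure
rng = np.random.default_rng(1)
f = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
g = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def convolve(a, b):
    """Groupoid convolution on the pair groupoid M x M."""
    out = np.zeros((n, n), dtype=complex)
    for u in range(n):
        for v in range(n):
            # x = (u, v) has s(x) = v; y = (v, w) runs over the t-fiber of v,
            # so that xy = (u, w) and y^{-1} = (w, v)
            out[u, v] = sum(a[u, w] * b[w, v] for w in range(n))
    return out

def star(a):
    """Involution: f*(u, v) is the complex conjugate of f(v, u)."""
    return a.conj().T

def reduced_norm(a):
    """Operator norm of the representation pi_u(f) on l^2(M), the same for all u."""
    return np.linalg.norm(a, 2)

c_star_identity_gap = abs(reduced_norm(convolve(star(f), f)) - reduced_norm(f) ** 2)
```

In this finite model the reduced C*-algebra is the full matrix algebra, in line with (C.492), since every operator on a finite-dimensional Hilbert space is compact.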
C.18 Group C*-algebras and crossed product algebras

It can be shown that in cases 1-3 above we have $C^*(G) = C^*_r(G)$. It is useful to give a more direct and general construction of both $C^*(G)$ and $C^*_r(G)$ in the case where $G$ is a group or an action groupoid; although the former is a special case of the latter by taking the trivial $G$-action on a point, we treat the group case separately first.

Let $G$ be a Lie group, or, more generally, a locally compact group, which for simplicity we assume to be unimodular (so that it has a left Haar measure $dx$ that is also right invariant). We turn $C_c^\infty(G)$, or, more generally, $C_c(G)$, into an algebra with involution by specializing (C.473) - (C.474) to groups, i.e. (changing $y \mapsto x^{-1}y$),

$$f * g(x) = \int_G dy\, f(y) g(y^{-1}x); \tag{C.504}$$

$$f^*(x) = \overline{f(x^{-1})}. \tag{C.505}$$

Any unitary representation $u$ of $G$ on a Hilbert space $H$ (assumed strongly continuous, as always) then gives rise to a representation $u_{\int}$ of this *-algebra by

$$u_{\int}(f) = \int_G dx\, f(x) u(x), \tag{C.506}$$

in that $u_{\int}(f * g) = u_{\int}(f) u_{\int}(g)$ and $u_{\int}(f^*) = u_{\int}(f)^*$. Let

$$\|f\| = \sup\{\|u_{\int}(f)\|\}, \tag{C.507}$$

where the supremum is over all continuous unitary representations of $G$.
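For a finite group the integrals in (C.504) - (C.506) become sums, and the homomorphism properties can be checked directly. The sketch below does this for $G = \mathbb{Z}_N$ with counting measure and the shift representation on $\mathbb{C}^N$; the function names `convolve`, `star` and `integrated` are hypothetical stand-ins for $f * g$, $f^*$ and the integrated representation.

```python
import numpy as np

N = 5
S = np.roll(np.eye(N), 1, axis=0)           # generator of the shift representation

def u(x):
    """Unitary representation of Z_N on C^N."""
    return np.linalg.matrix_power(S, int(x) % N)

def convolve(f, g):
    """(f * g)(x) = sum_y f(y) g(y^{-1} x), written additively for Z_N."""
    return np.array([sum(f[y] * g[(x - y) % N] for y in range(N)) for x in range(N)])

def star(f):
    """f*(x) = complex conjugate of f(x^{-1})."""
    return np.array([np.conj(f[(-x) % N]) for x in range(N)])

def integrated(f):
    """Integrated form of u: the sum over x of f(x) u(x)."""
    return sum(f[x] * u(x) for x in range(N))

rng = np.random.default_rng(2)
f = rng.normal(size=N) + 1j * rng.normal(size=N)
g = rng.normal(size=N) + 1j * rng.normal(size=N)

multiplicative = np.allclose(integrated(convolve(f, g)), integrated(f) @ integrated(g))
star_preserving = np.allclose(integrated(star(f)), integrated(f).conj().T)
l1_bounded = np.linalg.norm(integrated(f), 2) <= np.sum(np.abs(f)) + 1e-12
```

The last flag illustrates why the supremum in (C.507) is finite: each integrated representation is bounded by the $L^1$-norm of $f$.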

Definition C.119. The group C*-algebra $C^*(G)$ of $G$ is the closure of $C_c^\infty(G)$ or $C_c(G)$ in the norm (C.507). The reduced group C*-algebra $C^*_r(G)$ of $G$ is the closure of $C_c^\infty(G)$ or $C_c(G)$ in the norm

$$\|f\|_r = \|u_L(f)\|, \tag{C.508}$$

where $u_L$ is the left-regular representation $u_L(G)$ on $H = L^2(G)$, cf. (7.52).

The relationship between the two group C*-algebras is given by

$$C^*_r(G) \cong u_L(C^*(G)) \cong C^*(G)/\ker(u_L). \tag{C.509}$$

Definition C.120. A unitary representation $u_1$ is weakly contained in $u_2$ if $\|u_1(f)\| \leq \|u_2(f)\|$ for all $f \in C_c(G)$. If every unitary representation of $G$ is weakly contained in $u_L$, and hence $\ker(u_L) = 0$, so that $C^*_r(G) = C^*(G)$, we call $G$ amenable.

It can be shown that $G$ is amenable iff the commutative C*-algebra $C_b(G)$ of bounded continuous functions on $G$ with sup-norm has a left-invariant state $\omega$, i.e.,

$$\omega(L_y f) = \omega(f) \quad (y \in G,\ f \in C_b(G)). \tag{C.510}$$




Here $L_y f(x) = f(y^{-1}x)$ as usual. This is the case, for example, for all compact groups, all abelian groups, and all solvable groups (and semi-direct products thereof, like the Euclidean group). Non-compact semi-simple Lie groups, like $SL_n(\mathbb{R})$ or the Lorentz group, are not amenable; the same holds for e.g. the Poincaré group.
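In the simplest amenable case, a finite group, the invariant state in (C.510) is just the normalized average over the group. The sketch below assumes the illustrative choice $G = \mathbb{Z}_6$ (written additively) and a hypothetical sample function `f`; it checks left-invariance, normalization, and positivity on one example using exact rational arithmetic.

```python
from fractions import Fraction

n = 6
G = range(n)  # cyclic group Z_6 under addition mod 6

def left_translate(y, f):
    """(L_y f)(x) = f(y^{-1} x); in additive notation this is f(x - y)."""
    return [f[(x - y) % n] for x in G]

def mean(f):
    """Normalized average over the group: omega(f) = (1/|G|) sum_x f(x)."""
    return sum(f, Fraction(0)) / n

# An arbitrary real-valued test function on G.
f = [Fraction(k * k - 3) for k in G]

invariant = all(mean(left_translate(y, f)) == mean(f) for y in G)
normalized = mean([Fraction(1)] * n) == 1
positive = mean([v * v for v in f]) >= 0
print(invariant, normalized, positive)
```

For infinite amenable groups no such explicit formula is available in general, which is why (C.510) only asserts the existence of an invariant state.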

By construction, there is a bijective correspondence $u \leftrightarrow u^{\int}$ between unitary representations of $G$ and non-degenerate representations of $C^*(G)$ (which restricts to a bijection between unitary representations of $G$ that are weakly contained in $u_L$ and non-degenerate representations of $C_r^*(G)$). In one direction, this is given by (C.506), whilst in the other, one first decomposes $u^{\int} = \rho$ as a direct sum of cyclic representations with cyclic vectors $\Omega_i$, and then, for each $i$ in the sum, puts

$$u(x)\rho(f)\Omega_i = \rho(L_x f)\Omega_i. \tag{C.511}$$


Now take any C*-algebra $A$ on which $G$ acts, in that there is a continuous group homomorphism $\alpha: G \to \mathrm{Aut}(A)$, i.e., for each $x \in G$ we have an invertible homomorphism $\alpha_x: A \to A$ such that $\alpha_x \circ \alpha_y = \alpha_{xy}$ and $\alpha_e = \mathrm{id}_A$ (or, equivalently, $\alpha_x^{-1} = \alpha_{x^{-1}}$), and for each $a \in A$, the function $x \mapsto \alpha_x(a)$ from $G$ to $A$ is continuous. We turn the space $C_c(G,A)$ into a *-algebra by generalizing (C.504) - (C.505) to

$$f * g(x) = \int_G dy\, f(y)\, \alpha_y(g(y^{-1}x)); \tag{C.512}$$

$$f^*(x) = \alpha_x(f(x^{-1})^*). \tag{C.513}$$

We construct representations of $C_c(G,A)$ as a *-algebra from pairs $(u(G), \pi(A))$, where $u$ is a unitary representation of $G$, and $\pi$ is a representation of $A$ (both defined on the same Hilbert space $H$) that satisfy the covariance condition

$$\pi(\alpha_x(a)) = u(x)\pi(a)u(x)^*. \tag{C.514}$$

Writing $\pi \rtimes u^{\int}$ for the associated representation of $C_c(G,A)$, we put

$$\pi \rtimes u^{\int}(f) = \int_G dx\, \pi(f(x))\, u(x), \tag{C.515}$$

and define

$$\|f\| = \sup\{\|\pi \rtimes u^{\int}(f)\|\}, \tag{C.516}$$

where the supremum runs over all pairs $(u(G), \pi(A))$ satisfying the covariance condition (C.514). The closure $C^*(G,A,\alpha)$ of $C_c(G,A)$ in this norm is a C*-algebra called the crossed product or covariance algebra defined by $G$, $A$, and $\alpha$. Once again, by construction there is a bijective correspondence $(u,\pi) \leftrightarrow \pi \rtimes u^{\int}$ between pairs $(u,\pi)$ satisfying (C.514) and non-degenerate representations $\pi \rtimes u^{\int} = \rho$ of $C^*(G,A,\alpha)$, in one direction given by (C.515), and in the other by

$$u(x)\rho(f)\Omega_i = \rho(\alpha_x(L_x f))\Omega_i; \tag{C.517}$$

$$\pi(a)\rho(f)\Omega_i = \rho(af)\Omega_i. \tag{C.518}$$

Here $\alpha_x(L_x f) \in C_c(G,A)$ is the function $y \mapsto \alpha_x(f(x^{-1}y))$; similarly, $af \in C_c(G,A)$ is given by $y \mapsto a f(y)$, and the cyclic vectors are defined as in (C.511).

To construct a reduced crossed product, we take any injective representation $\pi_r(A)$ on some Hilbert space $K$, and from it construct a new Hilbert space

$$H = L^2(G,K) \cong L^2(G) \otimes K, \tag{C.519}$$

consisting of all measurable functions $\psi: G \to K$ for which $\int_G dx\, \|\psi(x)\|_K^2 < \infty$, with

$$\langle \varphi, \psi \rangle = \int_G dx\, \langle \varphi(x), \psi(x) \rangle_K \tag{C.520}$$

as the inner product. This Hilbert space $H$ carries a covariant pair $(u(G), \pi(A))$, viz.

$$u(y)\psi(x) = \psi(y^{-1}x); \tag{C.521}$$

$$\pi(a)\psi(x) = \pi_r(\alpha_{x^{-1}}(a))\psi(x), \tag{C.522}$$

and hence an associated representation $\pi \rtimes u^{\int}$ of $C_c(G,A)$ given by (C.515), which by continuity extends to a representation $\rho_r$ of $C^*(G,A,\alpha)$. As in the group case, we define $C_r^*(G,A,\alpha)$ as the closure of $C_c(G,A)$ in the norm $\|f\|_r = \|\rho_r(f)\|$, or as

$$C_r^*(G,A,\alpha) = \rho_r(C^*(G,A,\alpha)). \tag{C.523}$$

If $G$ is amenable, we once again have $C_r^*(G,A,\alpha) = C^*(G,A,\alpha)$, as for $C^*(G)$.
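The covariance condition (C.514) for a pair of the form (C.521) - (C.522) can be tested on a finite example. The sketch below assumes, purely for illustration, that $G = S_3$ acts on the three-point set $Q = \{0,1,2\}$, that $A$ is the algebra of functions on $Q$ with $(\alpha_x a)(q) = a(x^{-1}q)$, and that $\pi_r$ is multiplication on $K = \ell^2(Q)$; all function names are hypothetical. Covariance is checked in the equivalent form $\pi(\alpha_y(a))\,u(y) = u(y)\,\pi(a)$ on every basis vector of $\ell^2(G \times Q)$.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3 as permutation tuples
Q = range(3)                      # finite set with action x . q = x[q]

def mul(x, y):
    """Group product: (x y)(i) = x(y(i))."""
    return tuple(x[y[i]] for i in range(3))

def inv(x):
    """Inverse permutation."""
    r = [0] * 3
    for i, xi in enumerate(x):
        r[xi] = i
    return tuple(r)

def alpha(x, a):
    """Action on A = functions on Q: (alpha_x a)(q) = a(x^{-1} q)."""
    return [a[inv(x)[q]] for q in Q]

def u(y, psi):
    """(C.521): (u(y) psi)(x) = psi(y^{-1} x), where psi(x) is a vector in K."""
    return {x: psi[mul(inv(y), x)] for x in G}

def pi_r(a, k):
    """Injective representation of A on K = l^2(Q) by multiplication."""
    return [a[q] * k[q] for q in Q]

def pi(a, psi):
    """(C.522): (pi(a) psi)(x) = pi_r(alpha_{x^{-1}}(a)) psi(x)."""
    return {x: pi_r(alpha(inv(x), a), psi[x]) for x in G}

def delta(x0, q0):
    """Basis vector of l^2(G x Q) supported at (x0, q0)."""
    return {x: [1 if (x == x0 and q == q0) else 0 for q in Q] for x in G}

a = [2, -1, 5]  # an arbitrary element of A

covariant = all(
    pi(alpha(y, a), u(y, delta(x0, q0))) == u(y, pi(a, delta(x0, q0)))
    for y in G for x0 in G for q0 in Q
)
print(covariant)
```

Because both sides are linear in the vector, agreement on all basis vectors is enough to conclude the operator identity for this particular `a`.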

The main case of interest to us is given by a group action $G \circlearrowright Q$, as above, which gives rise to a crossed product $C^*(G,C_0(Q),\alpha) \equiv C^*(G,Q)$ through the choices

$$A = C_0(Q); \tag{C.524}$$

$$\alpha_x(f) = L_x f, \tag{C.525}$$

i.e., $\alpha_x(f)(q) = f(x^{-1}q)$. The (reduced) crossed product $C^*_{(r)}(G,Q)$, then, is the same as the (reduced) C*-algebra of the action groupoid $G \ltimes Q$. Identifying the spaces $C_c(G \times Q)$ and $C_c(G,C_c(Q))$, eqs. (C.512) - (C.513) now become

$$f * g(x,q) = \int_G dy\, f(y,q)\, g(y^{-1}x, y^{-1}q); \tag{C.526}$$

$$f^*(x,q) = \overline{f(x^{-1}, x^{-1}q)}. \tag{C.527}$$
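The product (C.526) and involution (C.527) can be tried out on a finite action groupoid, where the integral over $G$ becomes a sum. This is a minimal sketch under the illustrative assumption that $G = S_3$ acts on $Q = \{0,1,2\}$ by permutation, with three hypothetical sample functions; it checks associativity, the anti-multiplicativity $(f*g)^* = g^* * f^*$, and that the involution squares to the identity.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3 as permutation tuples
Q = range(3)                      # finite set with action x . q = x[q]

def mul(x, y):
    """Group product: (x y)(i) = x(y(i))."""
    return tuple(x[y[i]] for i in range(3))

def inv(x):
    """Inverse permutation."""
    r = [0] * 3
    for i, xi in enumerate(x):
        r[xi] = i
    return tuple(r)

def conv(f, g):
    """(C.526): (f*g)(x,q) = sum_y f(y,q) g(y^{-1} x, y^{-1} q)."""
    return {
        (x, q): sum(f[(y, q)] * g[(mul(inv(y), x), inv(y)[q])] for y in G)
        for x in G for q in Q
    }

def star(f):
    """(C.527): f^*(x,q) = conj(f(x^{-1}, x^{-1} q))."""
    return {(x, q): f[(inv(x), inv(x)[q])].conjugate() for x in G for q in Q}

def close(f, g, tol=1e-9):
    return all(abs(f[k] - g[k]) < tol for k in f)

# Three arbitrary complex-valued functions on G x Q.
f = {(x, q): complex(i + q, i - 2 * q) for i, x in enumerate(G) for q in Q}
g = {(x, q): complex(q - i, 1 + i * q) for i, x in enumerate(G) for q in Q}
h = {(x, q): complex(2 * i - q, q * q) for i, x in enumerate(G) for q in Q}

associative = close(conv(conv(f, g), h), conv(f, conv(g, h)))
anti_multiplicative = close(star(conv(f, g)), conv(star(g), star(f)))
involutive = close(star(star(f)), f)
print(associative, anti_multiplicative, involutive)
```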


The obvious candidate for a faithful representation of $C_0(Q)$ comes from a measure $\nu$ on $Q$ with support $Q$, so that we may take $K = L^2(Q,\nu)$ and $\pi_r(f) = m_f$, i.e., $\pi_r(f)\psi = f\psi$, $f \in C_0(Q)$. Identifying $L^2(G) \otimes L^2(Q)$ with $L^2(G \times Q)$, and using $\alpha_{x^{-1}}(f)(q) = f(xq)$ in (C.522), this yields

$$u(y)\psi(x,q) = \psi(y^{-1}x,q); \tag{C.528}$$

$$\pi(f)\psi(x,q) = f(xq)\psi(x,q); \tag{C.529}$$

$$\rho_r(f)\psi(x,q) = \int_G dy\, f(y,xq)\, \psi(y^{-1}x,q). \tag{C.530}$$

C.19 Continuous bundles of C*-algebras

As shown in Chapter 7, continuous bundles of C*-algebras form a mathematical bridge between the classical and the quantum worlds, but they also form a beautiful structure in their own right. In what follows, $I$ is an arbitrary locally compact Hausdorff space, but in the main text it is a subset of the unit interval $[0,1]$ that always contains 0 as an accumulation point, so one may have e.g. $I = [0,1]$ itself, or

$$I = (1/\mathbb{N}) \cup \{0\} \equiv 1/\dot{\mathbb{N}}, \tag{C.531}$$

where $\mathbb{N} = \{1,2,\ldots\}$. In physics, $I$ plays the role of the value set for Planck’s constant, but also below we generically write $\hbar \in I$, if only to avoid notational confusion with $x \in X$ (as $C_0(X)$ will be a typical fiber of the continuous bundles we study).

Definition C.121. Let $I$ be a locally compact Hausdorff space. A continuous bundle of C*-algebras over $I$ consists of a C*-algebra $A$, a collection of C*-algebras $(A_\hbar)_{\hbar \in I}$, and surjective homomorphisms $\varphi_\hbar: A \to A_\hbar$ for each $\hbar \in I$, such that:

1. The function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is in $C_0(I)$ for each $a \in A$.

2. Writing $\|\cdot\|_\hbar$ for the norm in $A_\hbar$, the norm of any $a \in A$ is given by

$$\|a\| = \sup_{\hbar \in I} \|\varphi_\hbar(a)\|_\hbar. \tag{C.532}$$

3. For any $f \in C_0(I)$ and $a \in A$, there is an element $fa \in A$ such that for each $\hbar \in I$,

$$\varphi_\hbar(fa) = f(\hbar)\varphi_\hbar(a). \tag{C.533}$$

A continuous (cross-) section of the bundle in question is a map $\hbar \mapsto a(\hbar) \in A_\hbar$, $\hbar \in I$, for which there is an $a \in A$ such that $a(\hbar) = \varphi_\hbar(a)$ for each $\hbar \in I$.

Thus $A$ may be identified with the space of continuous sections of the bundle: if we do so, the homomorphism $\varphi_\hbar$ is just the evaluation map at $\hbar$. The structure of $A$ as a C*-algebra then corresponds to pointwise operations on sections. The idea is that the family $(A_\hbar)_{\hbar \in I}$ of C*-algebras is glued together by specifying a topology on the disjoint union $\sqcup_{\hbar \in I} A_\hbar$, seen as a fibre bundle over $I$. However, this topology is in fact given rather indirectly, namely via the specification of the space of continuous sections. This is reminiscent of Theorem C.23, which specifies the topology on a locally compact Hausdorff space $X$ via the C*-algebra $C_0(X)$. More generally (the previous case being the trivial vector bundle $E = X \times \mathbb{C}$), the Serre-Swan Theorem about fiber bundles allows one to reconstruct the topology on a locally trivial vector bundle $E \to X$ from the (finitely generated projective) $C_0(X)$-module $C_0(X,E)$ of continuous sections of $E$. As in Definition C.121, one has maps $\varphi_x: C_0(X,E) \to E_x$ given by evaluation at $x$, so that (C.533) holds. However, continuous bundles of C*-algebras need not be locally trivial; for us, this is even the whole point!

Another way of looking at continuous bundles of C*-algebras starts from a nondegenerate homomorphism $\varphi$ from $C_0(I)$ to the center $Z(M(A))$ of the multiplier algebra $M(A)$ of $A$ (see §C.10); we simply write $fa$ for $\varphi(f)a$, and similarly $C_0(I)A$. In this notation, nondegeneracy means that $C_0(I)A$ is dense in $A$. Given such a nondegenerate homomorphism $\varphi: C_0(I) \to Z(M(A))$, one may define fiber algebras by

$$A_\hbar = A/(C_0(I;\hbar) \cdot A); \tag{C.534}$$

$$C_0(I;\hbar) = \{f \in C_0(I) \mid f(\hbar) = 0\}; \tag{C.535}$$

since $C_0(I;\hbar) \cdot A$ is an ideal in $A$, the quotient $A_\hbar$ is a C*-algebra. The projections $\varphi_\hbar: A \to A_\hbar$ are then given by the corresponding quotient maps sending $a \in A$ to its equivalence class in $A_\hbar$. In general, the function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is merely upper semicontinuous, so that one only obtains a structure equivalent to the one described in Definition C.121 if one explicitly requires the above function to be in $C_0(I)$, in which case clause 2 of Definition C.121 follows, too.

It is easy to find “trivial” examples of continuous bundles of C*-algebras: fix some C*-algebra $B$ and take $A = C_0(I,B)$ with pointwise operations. In that case, $A_\hbar = B$ for each $\hbar \in I$, and the map $\varphi_\hbar: A \to B$ is given by $\varphi_\hbar(a) = a(\hbar)$.

It is not so easy to find nontrivial examples, even with isomorphic fibers (these were first given by Dixmier and Douady, who took the fiber algebras to be the compact operators $B_0(H)$). To connect classical to quantum, we need bundles over $I \subseteq [0,1]$ as described above, with non-isomorphic fibers, of which the fiber $A_0$ above $\hbar = 0$ is isomorphic to $C_0(X)$ for some (locally compact) phase space $X$, and hence is commutative, whereas all other fibers are noncommutative. One might say that it is the job of (deformation) quantization theory to construct such fields. Without proof, we now describe the main class of examples relevant to physics.

As we have seen, each Lie groupoid $G$ canonically defines an associated C*-algebra $C^*(G)$, in which $C_c^\infty$ functions on $G$ endowed with a generalized convolution product (C.473) and involution (C.474) form a dense subspace. In particular,

$$C_r^*(TM) \cong C_0(T^*M); \tag{C.536}$$

$$C_r^*(M \times M) \cong B_0(L^2(M)), \tag{C.537}$$

where $M$ is a manifold (without boundary) with tangent bundle $TM$ and cotangent bundle $T^*M$. More generally, for any given Lie groupoid $G$ one may define

$$A_0 = C_r^*(N_M G) \quad (\hbar = 0); \tag{C.538}$$

$$A_\hbar = C_r^*(G) \quad (\hbar > 0), \tag{C.539}$$

where $N_M G$ is the normal bundle to the embedding $M \hookrightarrow G$, cf. (C.444). Now consider the tangent groupoid $G^T$, which is a bundle over $[0,1]$ with fibers

$$G_0^T = N_M G \quad (\hbar = 0); \tag{C.540}$$

$$G_\hbar^T = G \quad (\hbar > 0). \tag{C.541}$$

The interplay between the differential geometry of the tangent groupoid and the notion of (reduced) Lie groupoid C*-algebras is described by the following lemma.

Lemma C.122. The map $C_c^\infty(G^T) \to C_c^\infty(G_\hbar^T)$ that restricts $f$ to $G_\hbar^T \subset G^T$ continuously extends to a surjective homomorphism $\varphi_\hbar: C_r^*(G^T) \to C_r^*(G_\hbar^T)$, $\hbar \in [0,1]$.

Various special cases and this lemma ultimately led to the key result of the 1990s:

Theorem C.123. For any Lie groupoid $G$, the fibers (C.538) - (C.539) merge into a continuous bundle of C*-algebras over $I = [0,1]$ with total algebra $A = C_r^*(G^T)$ and homomorphisms $\varphi_\hbar: A \to A_\hbar$ as described in Lemma C.122.

The same result holds for the full groupoid C*-algebras $C^*(G^T)$ and $C^*(G_\hbar^T)$.

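For the pair groupoid $G = \mathbb{R}^n \times \mathbb{R}^n$, as in the argument (C.469) - (C.472) we take some $f \in C_c^\infty(T\mathbb{R}^n)$, seen as a function $f \in C^\infty([0,1] \times T\mathbb{R}^n)$ that is independent of $\hbar$. This yields a function $f \circ \phi^{-1} \in C_c^\infty(G^T)$, and by construction,

$$f \circ \phi^{-1}(0,x,v) = f(x,v); \tag{C.542}$$

$$f \circ \phi^{-1}(\hbar,x,y) = f\big(\tfrac{1}{2}(x+y), (x-y)/\hbar\big) \quad (\hbar > 0). \tag{C.543}$$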

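By Lemma C.122, the function $\varphi_0(f \circ \phi^{-1})$ is an element of

$$A_0 = C_r^*(T\mathbb{R}^n); \tag{C.544}$$

this element is just the function $f$. For $\hbar > 0$, we see $\varphi_\hbar(f \circ \phi^{-1})$ as an element of

$$A_\hbar \cong B_0(L^2(\mathbb{R}^n)), \tag{C.545}$$

through (C.490) - (C.491). Calling this element $Q_\hbar^W(\hat f)$, where $\hat f$ is the fiberwise Fourier transform of $f$ given in (C.547), we have

$$Q_\hbar^W(\hat f)\psi(x) = \hbar^{-n} \int_{\mathbb{R}^n} d^ny\, f\big(\tfrac{1}{2}(x+y), (x-y)/\hbar\big)\, \psi(y). \tag{C.546}$$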

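We now use the isomorphism (C.536), implemented through the Fourier transform

$$\hat f(x,p) = \int_{\mathbb{R}^n} d^nv\, f(x,v)\, e^{-ipv}; \tag{C.547}$$

$$f(x,v) = \int_{\mathbb{R}^n} \frac{d^np}{(2\pi)^n}\, \hat f(x,p)\, e^{ipv}. \tag{C.548}$$

Hence as an element of $C_0(T^*\mathbb{R}^n)$, the operator $\varphi_0(f \circ \phi^{-1})$ is $\hat f$. From this perspective, using (C.548), eq. (C.546) may be rewritten in the more familiar form

$$Q_\hbar^W(\hat f)\psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^np\, d^ny}{(2\pi\hbar)^n}\, e^{ip(x-y)/\hbar}\, \hat f\big(\tfrac{1}{2}(x+y), p\big)\, \psi(y). \tag{C.549}$$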

It follows that any $f \in C_c^\infty(T\mathbb{R}^n)$ defines a continuous cross-section of the continuous bundle of C*-algebras defined by $A = C_r^*((\mathbb{R}^n \times \mathbb{R}^n)^T)$, with $\hat f$ given by (C.547), and

$$0 \mapsto \hat f \in C_0(T^*\mathbb{R}^n); \tag{C.550}$$

$$\hbar \mapsto Q_\hbar^W(\hat f) \in B_0(L^2(\mathbb{R}^n)). \tag{C.551}$$

See also §7.1. These formulae were written down for the special case $M = \mathbb{R}^n$, but similar results (based on the exponential map as defined in Riemannian geometry) apply to any manifold. Moreover, as explained in §§7.2-7.4, Mackey’s theory of quantization based on systems of imprimitivity and induced group representations falls squarely under the above umbrella, where $G$ is an action groupoid.

We also employ continuous bundles of C*-algebras with non-isomorphic fibers even away from $\hbar = 0$. The construction of these fields relies on the following result, which is a special case of a more general claim; we just state the case we need, in which $I = 1/\dot{\mathbb{N}}$; continuity then imposes conditions at $\hbar = 0$ only (as $I$ is discrete elsewhere). We identify the total space $A$ of a (continuous) bundle of C*-algebras with the space of its (continuous) sections, as explained at the beginning of this section; thus $a \in A \subset \prod_\hbar A_\hbar$ takes the form $a = (a_\hbar)_{\hbar \in I}$, $a_\hbar \in A_\hbar$.

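Proposition C.124. Suppose one has a family $(A_\hbar)_{\hbar \in I}$ of C*-algebras over $I = 1/\dot{\mathbb{N}}$, as well as a subset $\tilde{A} \subset \prod_\hbar A_\hbar$ that satisfies the following conditions:

1. The set $\{\tilde{a}_\hbar \mid \tilde{a} \in \tilde{A}\}$ is dense in $A_\hbar$ for each $\hbar \in I$.

2. One has $\lim_{N \to \infty} \|\tilde{a}_{1/N}\| = \|\tilde{a}_0\|$ for each $\tilde{a} \in \tilde{A}$.

3. The set $\tilde{A}$ is a *-algebra (under pointwise operations).

Let $A$ consist of all $a \in \prod_\hbar A_\hbar$ for which one has

$$\lim_{N \to \infty} \|a_{1/N} - \tilde{a}_{1/N}\| = \|a_0 - \tilde{a}_0\| \quad (\tilde{a} \in \tilde{A}). \tag{C.552}$$

Regard $A$ as a C*-algebra under pointwise operations and norm (C.532), and define

$$\varphi_\hbar(a) = a_\hbar. \tag{C.553}$$

Then $(A, \{A_\hbar, \varphi_\hbar\}_{\hbar \in I})$ is a continuous bundle of C*-algebras (and is the unique such bundle whose space of sections contains $\tilde{A}$).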

The proof relies on the following lemma (which we state for general compact $I$).

Lemma C.125. The total C*-algebra $A$ of (sections of) a continuous bundle of C*-algebras is locally uniformly closed. That is, if $a \in \prod_\hbar A_\hbar$ is such that for every $\hbar_0 \in I$ and every $\varepsilon > 0$, there exists $b^{\hbar_0} \in A$ and a neighborhood $U_{\hbar_0}$ of $\hbar_0$ in which $\|a_\hbar - b^{\hbar_0}_\hbar\| < \varepsilon$ for all $\hbar \in U_{\hbar_0}$, then $a \in A$.

Equivalently, if $A$ (etc.) is a continuous bundle of C*-algebras, and $a \in \prod_\hbar A_\hbar$ is such that the function $\hbar \mapsto \|a_\hbar - b_\hbar\|$ lies in $C(I)$ for each $b \in A$, then $a \in A$.

Proof. Since $I$ is compact, it has a finite cover $\{U_1, \ldots, U_n\}$ with associated partition of unity $\{u_i\}$. With $a$ and $\varepsilon$ as in the lemma, take $\hbar_i \in U_i$ and $b^{\hbar_i}$ also as in the lemma, and define $b = \sum_i u_i b^{\hbar_i}$. Then $b$ satisfies $\sup_{\hbar \in I} \|a_\hbar - b_\hbar\| < \varepsilon$, and also $b \in A$, because of Definition C.121.3. Hence $a \in A$ by Definition C.121.2 and completeness of $A$.

As to the equivalent version, given $a \in \prod_\hbar A_\hbar$ and $\hbar_0 \in I$, because $\varphi_{\hbar_0}$ is surjective, there is a $b^{\hbar_0} \in A$ such that $a_{\hbar_0} = b^{\hbar_0}_{\hbar_0}$. The assumption in the second part then implies that the conditions in the first part are satisfied, so that $a \in A$. □

We are now in a position to prove Proposition C.124.

Proof. We first show that $A$ as defined in the proposition is locally uniformly closed. With the notation of Lemma C.125 and its proof, take $\tilde{a} \in \tilde{A}$, and define the functions

$$f_{a\tilde{a}}: \hbar \mapsto \|a_\hbar - \tilde{a}_\hbar\|; \tag{C.554}$$

$$f_{b\tilde{a}}: \hbar \mapsto \|b^{\hbar_0}_\hbar - \tilde{a}_\hbar\|. \tag{C.555}$$

Since $|\,\|X\| - \|Y\|\,| \leq \|X - Y\|$, one obtains

$$|f_{a\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar)| < \varepsilon, \tag{C.556}$$

for all $\hbar \in U_{\hbar_0}$. By assumption, $f_{b\tilde{a}}$ is continuous, so that

$$|f_{b\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar_0)| < \varepsilon, \tag{C.557}$$

for all $\hbar$ in some neighborhood $U'$ of $\hbar_0$ (which we may take to lie inside $U_{\hbar_0}$). Combining the two inequalities, the first used both at $\hbar$ and at $\hbar_0$, yields

$$|f_{a\tilde{a}}(\hbar) - f_{a\tilde{a}}(\hbar_0)| < 3\varepsilon, \tag{C.558}$$

for all $\hbar \in U'$. Hence $f_{a\tilde{a}}$ is continuous at any $\hbar_0 \in I$, so that $a \in A$ by Lemma C.125.

Using this property, it is easily shown that $A$ is a C*-algebra, and that condition 3 in Definition C.121 is satisfied. It is clear from Definition C.121.1 and the definition of $A$ in the proposition that $A$ is maximal. On the other hand, according to the second part of Lemma C.125, $A$ is minimal, so that it is unique. □

To close, let us explain to what extent we can say that a given section $(a_{1/N})_N$ of either one of our continuous bundles $A^{(c)}$ or $A^{(q)}$ “converges” to its value $a_0$.

Proposition C.126. Let $(a_0, a_{1/N})$ and $(a_0', a_{1/N}')$ be continuous cross-sections of some continuous bundle $A$ of C*-algebras over $I = 1/\dot{\mathbb{N}}$, such that

$$\lim_{N \to \infty} \|a_{1/N} - a_{1/N}'\| = 0. \tag{C.559}$$

Then $a_0 = a_0'$. In particular, if $(a_0, a_{1/N})$ is a continuous cross-section, then $a_0$ is uniquely determined by the $(a_{1/N})$ and we may symbolically write

$$a_0 = \lim_{N \to \infty} a_{1/N}. \tag{C.560}$$

Proof. The last part of Lemma C.125 states that the function defined by

$$0 \mapsto \|a_0 - a_0'\|;$$

$$1/N \mapsto \|a_{1/N} - a_{1/N}'\|$$

is continuous on $1/\dot{\mathbb{N}}$ (i.e., continuous at 0), so that $\|a_0 - a_0'\| = 0$ by (C.559). □

C.20 von Neumann algebras and the σ-weak topology

In this section and in §C.24 we turn to special classes of C*-algebras that are occa¬ 
sionally used in quantum (field) theory. Since the arguments tend to become very 
lengthy and technical, we will only prove some key results (e.g. von Neumann’s 
Double Commutant Theorem), and mention other results without proof (references 
to which may be found in the Notes). This also applies to the next four sections. 

The subject of operator algebras historically started with what we now call von 
Neumann algebras, in honour of the founder of the subject (although, curiously, 
C*-algebras are not called “Gelfand-Naimark algebras”; perhaps they should!). 

The first result in operator algebras was and is the Double Commutant Theorem: 

Theorem C.127. Let $M$ be a unital *-subalgebra of $B(H)$. Then the following conditions are equivalent (and, if satisfied, define $M$ to be a von Neumann algebra):

(i) $M'' = M$;

(ii) $M$ is closed in the weak operator topology;

(iii) $M$ is closed in the strong operator topology.

Recall that the commutant $S'$ of any $S \subset B(H)$ is defined by

$$S' = \{a \in B(H) \mid ab = ba\ \forall b \in S\}, \tag{C.561}$$

and that the bicommutant of $S$ is $S'' = (S')'$. If $S^* = S$, in that $a \in S$ iff $a^* \in S$, then $S'$ is easily seen to be a unital *-algebra within $B(H)$. Furthermore, it is obvious that $S \subseteq S''$, so that the passage $S \mapsto S''$ is some sort of a closure operation within $B(H)$, comparable to the closure operation $L \mapsto L^{\perp\perp}$ within $H$ itself. Theorem C.127 shows that if $S$ is a unital *-algebra, the algebraic closure operation $S \mapsto S''$ coincides with two topological closure operations. To this effect, recall also that:

• The weak operator topology on $B(H)$ may be defined by saying that $a_\lambda \to a$ (where $(a_\lambda)$ is some net in $M$) iff $\langle \varphi, (a_\lambda - a)\psi \rangle \to 0$ for all $\varphi, \psi \in H$;

• The strong operator topology on $B(H)$ yields convergence $a_\lambda \to a$ of some net $(a_\lambda)$ iff $\|(a_\lambda - a)\psi\| \to 0$ for each $\psi \in H$.

Proof. The essence of the proof is already contained in the finite-dimensional case $H = \mathbb{C}^n$, where the nontrivial claim in Theorem C.127 is:

If $M$ is a unital *-subalgebra of $M_n(\mathbb{C})$, then $M'' = M$.

In fact, all we need to prove is $M'' \subseteq M$, since the converse inclusion is obvious. The idea is to take $n$ arbitrary (and hence possibly linearly independent) vectors $v_1, \ldots, v_n$ in $H$, and, given $a \in M''$, find some $b \in M$ such that $av_i = bv_i$ for all $i = 1, \ldots, n$. Hence $a = b$, so $a \in M$. To this end, we start with a single vector $v \in H$.

Form the linear subspace $Mv = \{mv \mid m \in M\}$ of $H$, with associated projection $e$ (i.e. $ew = w$ if $w \in Mv$ and $ew = 0$ if $w \in (Mv)^\perp$). Then $e \in M'$, and hence $a \in M''$ commutes with $e$. Since $1_H \in M$, we have $v \in Mv$, so $v = ev$, and we compute $av = aev = eav \in Mv$. Hence $av = bv$, for some $b \in M$.

Now run the same argument with the following substitutions:

• $H \rightsquigarrow H^n = H \oplus \cdots \oplus H$ (with $n$ terms).

• $M \rightsquigarrow M_n = \{\mathrm{diag}(m, \ldots, m) \mid m \in M\}$.

• $v \rightsquigarrow \mathbf{v} = \oplus_i v_i = (v_1, \ldots, v_n)$.

We then have $(M_n)'' = (M'')_n$, so that for any $a \in M''$ the matrix $\mathbf{a} = \mathrm{diag}(a, \ldots, a)$ in the previous argument yields a matrix $\mathbf{b} = \mathrm{diag}(b, \ldots, b) \in M_n$ such that $\mathbf{a}\mathbf{v} = \mathbf{b}\mathbf{v}$. But this is $av_i = bv_i$ for all $i = 1, \ldots, n$, so that $a = b$ and hence $M'' \subseteq M$.

If $H$ is infinite-dimensional, the above proof may be adapted by taking the closure of $Mv$ in $H$, which gives (iii) $\Rightarrow$ (i). Finally, (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) is trivial. □
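The finite-dimensional statement $M'' = M$ can be checked by plain linear algebra, since the commutant (C.561) is the solution space of the linear equations $XS = SX$. The sketch below is an illustration under stated assumptions, not the argument of the proof: it takes the hypothetical example $M = \mathrm{span}\{1, e\} \subset M_3$ with $e = \mathrm{diag}(1,0,0)$, works over the rationals (for real generators the complex commutant is the complexification, so dimensions agree), and uses helper names `nullspace` and `commutant` that are not from the text.

```python
from fractions import Fraction

def nullspace(rows, ncols):
    """Basis of {x : rows . x = 0}, by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    pivots = []
    r = 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [v / piv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for fc in (c for c in range(ncols) if c not in pivots):
        x = [Fraction(0)] * ncols
        x[fc] = Fraction(1)
        for i, pc in enumerate(pivots):
            x[pc] = -rows[i][fc]
        basis.append(x)
    return basis

def commutant(S, n):
    """Basis of S' = {X in M_n : X s = s X for all s in S}, cf. (C.561)."""
    rows = []
    for s in S:
        for i in range(n):
            for k in range(n):
                # Entry (i, k) of X s - s X, as a linear form in the entries of X.
                row = [Fraction(0)] * (n * n)
                for j in range(n):
                    row[i * n + j] += s[j][k]
                    row[j * n + k] -= s[i][j]
                rows.append(row)
    return [[v[i * n:(i + 1) * n] for i in range(n)] for v in nullspace(rows, n * n)]

n = 3
one = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
e = [[Fraction(int(i == j == 0)) for j in range(n)] for i in range(n)]

M_prime = commutant([one, e], n)            # commutant of M = span{1, e}
M_double_prime = commutant(M_prime, n)      # bicommutant of M

def in_M(x):
    """Membership in M = {diag(a, b, b)}."""
    off_diagonal_zero = all(x[i][j] == 0 for i in range(n) for j in range(n) if i != j)
    return off_diagonal_zero and x[1][1] == x[2][2]

print(len(M_prime), len(M_double_prime), all(in_M(x) for x in M_double_prime))
```

Here $M'$ consists of the block-diagonal matrices $\mathbb{C} \oplus M_2$, and the bicommutant has the same dimension as $M$ and lies inside it, so the two coincide in this example.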

Corollary C.128. Let $M$ be a unital *-subalgebra of $B(H)$. Then the closures of $M$ in the strong and weak topologies coincide with each other and with $M''$.

Corollary C.129. A von Neumann algebra is norm-closed, i.e., is a C*-algebra.

Since $S''' = S'$, the commutant of any self-adjoint set $S^* = S \subset B(H)$ is a von Neumann algebra. As a case in point, take a (strongly continuous) unitary group representation $u: G \to B(H)$. Then $u(x)^* = u(x^{-1})$, so $u(G)'$ is a von Neumann algebra. In fact, any von Neumann algebra $M$ takes this form, since one may take $G$ to be the group of all unitaries in $M'$ (and $u$ its defining representation), so that $u(G)' = M'' = M$. Furthermore, the bicommutant $A''$ of any C*-algebra $A \subset B(H)$ is a von Neumann algebra. An important example of this construction is the abelian von Neumann algebra $W^*(a) = C^*(a)''$ generated by a self-adjoint operator $a = a^* \in B(H)$, cf. (B.320).

Although the weak and strong topologies on $M$ appear in the fundamental double commutant theorem, the most important topology on a von Neumann algebra (besides the norm topology) is the so-called σ-weak topology (sometimes called the ultraweak topology). This topology corresponds to the following convergence:

• One has $a_\lambda \to a$ σ-weakly iff $\mathrm{Tr}(b(a_\lambda - a)) \to 0$ for each $b \in B_1(H)$.

To begin with, as far as Theorem C.127 is concerned this topology is at least on a par with the weak and the strong ones:

Theorem C.130. Let $M$ be a unital *-subalgebra of $B(H)$. Then $M'' = M$ (i.e. $M$ is a von Neumann algebra) iff $M$ is closed in the σ-weak operator topology.

This one is a bit more technical, so we just sketch the proof.

Proof. Define a new Hilbert space $H^\infty$, whose elements $\mathbf{v}$ are infinite sequences of vectors $(v_1, v_2, \ldots)$ in $H$ with $\sum_i \|v_i\|^2 < \infty$. The inner product is

$$\langle \mathbf{v}, \mathbf{w} \rangle = \sum_i \langle v_i, w_i \rangle. \tag{C.562}$$

The obvious (diagonal) embedding of $B(H)$ in $B(H^\infty)$, whose image is denoted by $B(H)_\infty$, restricts to $M \subset B(H)$, with image $M_\infty \subset B(H^\infty)$. Then the σ-weak topology on $B(H)$ is the relative weak topology on $B(H)_\infty$ (i.e., the weak topology on $B(H^\infty)$ restricted to $B(H)_\infty$), so that Theorem C.130 follows from Theorem C.127. □

This brings us to an important refinement of Theorem C.127, called Kaplansky’s Density Theorem (which should actually be seen as a lemma for numerous results):

Theorem C.131. Let $A \subset B(H)$ be a C*-algebra (or a *-algebra). Then the unit ball of $A$ is dense in the unit ball of $A''$ in the weak, strong, and σ-weak topologies.

The real significance of the σ-weak topology comes from Sakai’s Theorem:

Theorem C.132. A C*-algebra $M \subset B(H)$ is a von Neumann algebra iff $M$ is the (Banach) dual of a unique Banach space $M_*$ (called the predual of $M$).

We turn to the proof below. For example, by Theorem B.146, the predual of $B(H)$ is

$$B(H)_* \cong B_1(H). \tag{C.563}$$

In the commutative case, entry 10 in Table B.1 in §B.9 gives

$$L^\infty(X,\mu)_* \cong L^1(X,\mu); \tag{C.564}$$

the fact that $L^\infty(X,\mu)$, acting on $H = L^2(X,\mu)$ as multiplication operators, is a von Neumann algebra was established in §B.16. In the first example, the σ-weak topology on $B(H)$ obviously coincides with the weak*-topology defined by $B(H)_*$.

In general, there is a canonical embedding $M_* \hookrightarrow M^*$, $\varphi \mapsto \hat\varphi$, with $\hat\varphi(a) = a(\varphi)$, cf. §B.9. Proposition B.46 then shows that the image of $M_*$ in $M^*$ consists precisely of the weak*-continuous functionals on $M$ (recall that the weak*-topology on $M$ is the topology of pointwise convergence, seeing $M$ as the dual of $M_*$). If we now identify $\hat\varphi$ with $\varphi$, we have the following generalization of the observation just made:

Theorem C.133. Let $M \subset B(H)$ be a von Neumann algebra. The predual $M_*$ of $M$ (seen as a subspace of $M^*$) coincides with the space of σ-weakly continuous functionals on $M$, and hence the σ-weak topology on $M$ coincides with the weak*-topology in its role as the dual Banach space of $M_*$.

σ-weakly continuous functionals on a von Neumann algebra $M$ are called normal.

Proof. Identifying $\hat\varphi$ with $\varphi$, we introduce the following spaces:

$$M^\perp = \{\varphi \in B(H)_* \mid \varphi(a) = 0\ \forall a \in M\};$$

$$M^{\perp\perp} = \{a \in B(H) \mid \varphi(a) = 0\ \forall \varphi \in M^\perp\}.$$

Having proved the theorem for $M = B(H)$, i.e., (C.563), the key is to show that

$$M^{\perp\perp} = M; \tag{C.565}$$

$$M_* \cong B(H)_*/M^\perp, \tag{C.566}$$

where (C.566) denotes an isometric isomorphism of normed spaces. Since the right-hand side of (C.566) is a Banach space, so is the left-hand side. This yields the first claim. Combining (C.566) with (C.565) and the duality $B(H) = B_1(H)^*$, we have

$$(M_*)^* \cong (B(H)_*/M^\perp)^* = M^{\perp\perp} = M.$$

This is the second claim. The first equality sign is true, because if $Y$ is a closed subspace of a Banach space $X$, then $(X/Y)^* = \{\varphi \in X^* \mid \varphi \upharpoonright Y = 0\}$.

For the remainder of the theorem, recall that $a_\lambda \to a$ σ-weakly in $M$ whenever $\varphi(a_\lambda - a) \to 0$ for all $\varphi \in B(H)_*$. By (C.566), this is equivalent to $a_\lambda \to a$ in the weak*-topology, since a possible component of $\varphi$ in $M^\perp$ drops out.

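We next prove (C.565). The inclusion $M \subseteq M^{\perp\perp}$ is trivial. For the converse, pick $a \notin M$; since $M$ is a von Neumann algebra, it is σ-weakly closed, so its complement $M^c$ in $B(H)$ is σ-weakly open. Hence there are $\varphi \in B(H)_*$ and $\varepsilon > 0$ such that the open neighbourhood $O(a) = \{b \in B(H) : |\varphi(a) - \varphi(b)| < \varepsilon\}$ of $a$ entirely lies in $M^c$. So $|\varphi(a) - \varphi(b)| \geq \varepsilon$ for all $b \in M$. This implies $\varphi(b) = 0$ for all $b \in M$ by linearity in $b$, i.e., $\varphi \in M^\perp$. Hence $|\varphi(a)| \geq \varepsilon$, so $a \notin M^{\perp\perp}$; hence $M^{\perp\perp} \subseteq M$.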

For (C.566), first note that $M^\perp$ is a norm-closed subspace of $B(H)_* = B_1(H)$, which is a Banach space in the trace-norm (which coincides with the norm inherited from $B(H)^*$, since the injection $B_1(H) \hookrightarrow B(H)^*$ is an isometry). Hence the quotient $B(H)_*/M^\perp$ is a Banach space in the canonical norm $\|[\varphi]\| = \inf\{\|\varphi + \psi\| \mid \psi \in M^\perp\}$, where $[\varphi]$ is the image of $\varphi \in B(H)_*$ under the canonical projection, and the norm is the one in $B(H)^*$. Let $\varphi_M = \varphi \upharpoonright M$ be the restriction of $\varphi \in B(H)_*$ to $M$. It is clear that the map $\varphi_M \mapsto [\varphi]$ is well defined and is a linear bijection from $M_*$ to $B(H)_*/M^\perp$. In fact, this map is isometric. First, writing $M_1$ and $B(H)_1$ for the unit balls of $M$ and $B(H)$, one trivially has

$$\|\varphi_M\| = \sup\{|\varphi(a)| \mid a \in M_1\} = \inf_{\psi \in M^\perp} \sup\{|\varphi(a) + \psi(a)| \mid a \in M_1\}, \tag{C.567}$$

since $\psi(a) = 0$. But this is clearly majorized by

$$\|[\varphi]\| = \inf_{\psi \in M^\perp} \sup\{|\varphi(a) + \psi(a)| \mid a \in B(H)_1\}, \tag{C.568}$$

since now the supremum is taken over a larger set. Hence $\|\varphi_M\| \leq \|[\varphi]\|$.

Conversely, for any $\varphi \in B(H)_*$ with $\|[\varphi]\| = 1$, by Corollary B.41 there exists an $a \in B(H)$ with $a \in M^{\perp\perp}$, $\varphi(a) = 1$ and $\|a\| = 1$. From (C.565), one then has $\|\varphi_M\| \geq |\varphi(a)| = 1 = \|[\varphi]\|$. This finishes the proof of Theorem C.133. □

Half of Theorem C.132 evidently follows from Theorem C.133. The converse (‘if’) implication uses a refinement of the GNS-construction, where the state $\omega$ is assumed to be σ-weakly continuous. In that case, using the theory of σ-weakly closed ideals of von Neumann algebras, it can be shown that $\pi_\omega(M)$ coincides with $\pi_\omega(M)''$ and hence is a von Neumann algebra. Since normal pure states on a von Neumann algebra may not exist (for example, take $M = L^\infty(0,1)$), the ‘crazy’ Hilbert space $H_c$ in the proof of Theorem C.87 must be replaced by the perhaps even crazier direct sum $H_{ec} = \oplus_{\omega \in S_n(M)} H_\omega$, where this time the sum is over all normal states on $M$. Similarly, in Lemma C.15 one should now have a normal state instead of a pure state. Otherwise, the proof that $M$ has a faithful representation as a von Neumann algebra on a Hilbert space essentially follows the proof of Theorem C.87.

Finally, uniqueness of the predual follows from Corollary C.139 below. □

Corollary C.134. Let $M \subset B(H)$ be a von Neumann algebra. Each normal functional $\varphi \in M_*$ on $M$ is of the form $\varphi(a) = \mathrm{Tr}(ba)$, for some $b \in B_1(H)$. In particular, $\varphi$ is a normal state iff $b$ is a density operator.

C.21 Projections in von Neumann algebras

General C*-algebras need not have any nontrivial projections; think of $C_0([0,1])$. On the other hand, von Neumann algebras are generated by their projections:

Theorem C.135. Let $\mathcal{P}(M) = \{e \in M \mid e^2 = e^* = e\}$, where $M$ is a von Neumann algebra. Then $M$ is the norm-closure of the linear span of $\mathcal{P}(M)$, and $M = \mathcal{P}(M)''$.

This is Corollary B.105. In addition, $\mathcal{P}(M)$ is not just a set.

Proposition C.136. The set $\mathcal{P}(M)$ of projections in a von Neumann algebra $M$ is a complete lattice under the partial ordering $e \leq f$ iff $ef = fe = e$.

Proof. Since $e \leq f$ in $M \subset B(H)$ iff $eH \subseteq fH$, the supremum $e \vee f$ is the projection on $eH + fH$, whilst the infimum $e \wedge f$ is the projection on $eH \cap fH$. For arbitrary families $(e_\lambda)_{\lambda \in \Lambda}$ of projections, $\vee_\lambda e_\lambda$ equals the projection on the closure of the linear span of all subspaces $H_\lambda = e_\lambda H$, whereas $\wedge_\lambda e_\lambda \equiv e$ is the projection on their intersection. To show that the latter lies in $M$ (provided all the $e_\lambda$ do, of course), note that each unitary $u \in M'$ satisfies $uH_\lambda = H_\lambda$ for all $\lambda$, so that also $u(\cap_\lambda H_\lambda) = \cap_\lambda H_\lambda$. Hence $eu = ue$ and so $e \in M'' = M$ (since each element of a von Neumann algebra is a linear combination of at most four unitaries in it; the proof is similar to Lemma B.145). Finally, by de Morgan’s Law we have $\vee_\lambda e_\lambda = (\wedge_\lambda e_\lambda^\perp)^\perp$, with $f^\perp = 1 - f$ for any $f \in \mathcal{P}(M)$. Hence also $\vee_\lambda e_\lambda \in M$. □
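The lattice operations in this proof can be tried out numerically for two non-commuting projections. The sketch below assumes two line projections in $\mathbb{R}^2$ as an illustrative example, and it approximates the infimum by von Neumann's alternating projection theorem, $(ef)^k \to e \wedge f$, which is a technique brought in for the illustration and not used in the text; the supremum is then obtained from de Morgan's Law exactly as at the end of the proof. All function names are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def line_projection(theta):
    """Orthogonal projection of R^2 onto the line spanned by (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def complement(e):
    """e^perp = 1 - e."""
    return [[IDENTITY[i][j] - e[i][j] for j in range(2)] for i in range(2)]

def meet(e, f, steps=200):
    """Infimum e ^ f, approximated by the power (e f)^steps."""
    ef = matmul(e, f)
    p = IDENTITY
    for _ in range(steps):
        p = matmul(p, ef)
    return p

def join(e, f):
    """Supremum e v f via de Morgan's Law: (e^perp ^ f^perp)^perp."""
    return complement(meet(complement(e), complement(f)))

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

e = line_projection(0.0)
f = line_projection(math.pi / 3)

meet_is_zero = close(meet(e, f), [[0.0, 0.0], [0.0, 0.0]])  # distinct lines meet in {0}
join_is_one = close(join(e, f), IDENTITY)                   # and together span R^2
print(meet_is_zero, join_is_one)
```

Two distinct lines in the plane intersect only in the origin and jointly span the plane, which is what the two checks express in lattice terms.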

This is nice in itself, but it also implies a very important result about maps between von Neumann algebras. Recall that a (purely algebraic) isomorphism between C*-algebras (seen as *-algebras) is automatically isometric and hence norm-continuous; see Theorem C.62. An even better result holds for von Neumann algebras:

Theorem C.137. A (purely algebraic) isomorphism $\varphi: M \to N$ between von Neumann algebras (seen as *-algebras) is an isomorphism of Banach spaces as well as a homeomorphism with respect to the σ-weak topologies on $M$ and $N$.

This theorem only seems to have rather difficult proofs. One of them, which uses Proposition C.136, rests on the following result. First, we say that a map $\varphi: M \to N$ of von Neumann algebras is completely additive if for any family $(e_\lambda)$ in $\mathcal{P}(M)$,

$$\varphi(\vee_\lambda e_\lambda) = \vee_\lambda \varphi(e_\lambda). \tag{C.569}$$

Lemma C.138. Let $\varphi: M \to N$ be a homomorphism of von Neumann algebras.

1. $\varphi$ is σ-weakly continuous iff it is completely additive.

2. If $\varphi$ is a (purely algebraic) isomorphism, then it is completely additive.

The proof of claim 2 is easy, as is the implication from σ-weak continuity to complete additivity in claim 1. The converse implication, however, is quite difficult. In any case, Theorem C.137 now follows, so that we may speak of isomorphisms between von Neumann algebras without any ambiguity.

Corollary C.139. If two von Neumann algebras $M$ and $N$ are algebraically isomorphic, then their preduals $M_*$ and $N_*$ are isomorphic as Banach spaces. In particular (take $M = N$), the predual of a von Neumann algebra is unique (up to isometric isomorphism).

A second proof of Theorem C.137 uses Theorem C.132 (and hence provides no non-circular proof of Corollary C.139), as follows.

Proof. Since $\varphi$ is isometric by Corollary C.129 and Theorem C.62, it induces a dual isomorphism (of Banach spaces) $\varphi^*: N^* \to M^*$, with the property that $M = (\varphi^*(N_*))^*$ under the map

$$a \mapsto (\varphi^*(\omega) \mapsto \omega(\varphi(a))) \quad (a \in M,\ \omega \in N_*). \tag{C.570}$$

Uniqueness of the predual then yields $\varphi^*(N_*) = M_*$, which in turn implies that $\varphi$ preserves pointwise convergent nets: if $\omega'(a_\lambda) \to \omega'(a)$ for all $\omega' \in M_*$, then $\omega(\varphi(a_\lambda)) \to \omega(\varphi(a))$ for all $\omega \in N_*$. Hence $\varphi$ is σ-weakly continuous. □

Theorem C.137 shows that the notion of isomorphism to be used in the classification of von Neumann algebras $M$ is unambiguous. There are two totally different cases of von Neumann algebras (only $M = \mathbb{C}$ falls in both classes):

• Abelian von Neumann algebras, which equal their center ($M \cap M' = M$);

• Factors, which have trivial center ($M \cap M' = \mathbb{C} \cdot 1$).

A factor has no nontrivial decomposition $M = M_1 \oplus M_2$, whereas an abelian von Neumann algebra (except $M = \mathbb{C}$) does have such a decomposition (typically even many of them). Using von Neumann’s technique of direct integrals, which generalizes direct sums (and will not be reviewed here), the classification of general von Neumann algebras may be reduced to these two cases. We start with the first class.

We know that if $(X,\Sigma,\mu)$ is some σ-finite Borel space with associated Hilbert space $L^2(X,\mu)$, then the commutative C*-algebra $L^\infty(X,\mu)$ is mapped isometrically into $B(L^2(X,\mu))$ via $f \mapsto m_f$, see Proposition B.73 and especially (B.240). If we denote the image of this map by $L^\infty(X,\mu)$ also, then $L^\infty(X,\mu)'' = L^\infty(X,\mu)$ by (B.346), so $L^\infty(X,\mu) \subset B(L^2(X,\mu))$ is an abelian von Neumann algebra. In general:

Theorem C.140. LetM C B(II) be an abelian von Neumann algebra. Then 


M^L-{X,p), 


(C.571) 


for some locally compact space X and probability measure jJ. on X. 

If H is separable, this follows from Theorems B.116 (including the remarks after its
proof) and B.117 in §B.16. The proof for arbitrary Hilbert space is quite technical
and will be omitted, but the idea is to find an abelian C*-algebra A for which $M = A''$,
upon which $X = \Sigma(A)$, and the measure $\mu$ is constructed such that $\mu(A) = 0$
iff $\mu_\psi(A) = 0$ for all unit vectors $\psi \in H$, with $\mu_\psi$ defined similarly to (B.304).
In general, one cannot take $A = M$, since $\Sigma(M)$ may not support such measures.
Thus we have a complete and satisfactory characterization of abelian von Neumann
algebras, including their projections: these are simply the (equivalence classes of)
characteristic functions $1_A$, where $A \in \Sigma$ is a Borel set in X (modulo null sets).
The advantage of this approach is that there are often simple models for X; we
know from the classification of maximal abelian von Neumann algebras on separable
Hilbert space in §B.17 that $X = [0,1]$ (with Lebesgue measure) and $X = \mathbb{N}$ (with
counting measure) are enough in that case. However, the pair $(X,\mu)$ lacks intrinsic
uniqueness properties. Thus it also makes sense to apply Theorem C.8 to abelian
von Neumann algebras, so that $M \cong C(X)$. Since by Theorem C.135, M has plenty
of projections, which as elements of $C(X)$ are realized by characteristic functions
$1_A$, where $A \subset X$, the space X must have lots of clopen (i.e. closed and open) sets.

It can be shown that X arises as the Gelfand spectrum of some abelian von Neumann
algebra iff it is hyperstonean, where we say that a compact Hausdorff X is:

• Stone if the only connected subsets are points (equivalently, a Stone space is
compact, $T_0$, and has a basis of clopen sets).

• stonean if it is Stone and the closure of each open set is open.

• hyperstonean if it is stonean, and for any nonzero $f \in C(X, \mathbb{R}^+)$ there exists a
completely additive positive measure $\mu$ such that $\mu(f) > 0$.

This replaces the classification of abelian von Neumann algebras up to isomorphism
by the classification of hyperstonean spaces up to homeomorphism, which is hardly
an improvement (the only other area of mathematics where such wacky spaces appear
is algebraic logic). However, we do obtain a nice relationship between the
projection lattice of an abelian von Neumann algebra and its Gelfand spectrum (at
this point please recall Theorem D.5 and surrounding text in Appendix D).

Theorem C.141. The projection lattice $\mathcal{P}(M)$ of a von Neumann algebra M is
Boolean iff M is abelian, in which case there is a homeomorphism

$\Sigma(M) \cong \mathcal{S}(\mathcal{P}(M))$ (C.572)

between the Gelfand spectrum of M (as a commutative C*-algebra) and the Stone
spectrum of $\mathcal{P}(M)$ (as a Boolean lattice). Hence we have isomorphisms

$M \cong C(\mathcal{S}(\mathcal{P}(M)))$; (C.573)

$\mathcal{O}(\Sigma(M)) \cong \mathrm{Idl}(\mathcal{P}(M))$, (C.574)

as (commutative) C*-algebras and as frames, respectively.

Proof. In the commutative case, the lattice operations in $\mathcal{P}(M)$ are given by

$e \wedge f = ef$; (C.575)

$e \vee f = e + f - ef$; (C.576)

$e^\perp = 1_M - e$, (C.577)

as may be verified by embedding $M \subset B(H)$ and using the proof of Proposition
C.136; eq. (C.577) is true for any M. One then finds that $\mathcal{P}(M)$ is distributive, since

$e \wedge (f \vee g) = ef + eg - efg = (e \wedge f) \vee (e \wedge g)$, (C.578)

and similarly with $\vee$ and $\wedge$ swapped. Since $\mathcal{P}(M)$ is orthomodular for arbitrary
von Neumann algebras M and is distributive if M is abelian, it follows that $\mathcal{P}(M)$
is Boolean. Conversely, if $\mathcal{P}(M)$ is Boolean, we may compute

$(e \wedge (e \wedge f)^\perp)^\perp = (e \wedge (e^\perp \vee f^\perp))^\perp = ((e \wedge e^\perp) \vee (e \wedge f^\perp))^\perp = (e \wedge f^\perp)^\perp = e^\perp \vee f$,

and since $f \leq g \vee f$ for any g, this implies $f \leq (e \wedge (e \wedge f)^\perp)^\perp$. Now $f \leq g^\perp$ implies
$fg = gf = 0$, so

$f (e \wedge (e \wedge f)^\perp) = (e \wedge (e \wedge f)^\perp) f = 0$. (C.579)

If $g \leq e$, then $e \wedge g = g$, hence $e \wedge g^\perp + g = e \wedge (1_M - g) + g = e$. So $g = e \wedge f$ gives

$e - (e \wedge f) = e \wedge (e \wedge f)^\perp$. (C.580)

Using (C.579) - (C.580) finally yields

$ef = ((e \wedge f) + e - (e \wedge f)) f = (e \wedge f) f + (e \wedge (e \wedge f)^\perp) f = e \wedge f$, (C.581)

and since $e \wedge f = f \wedge e$, we find $ef = fe$ for any two projections $e, f \in \mathcal{P}(M)$. Hence
M is abelian by Theorem C.135.

If we now realize the Gelfand spectrum $\Sigma(M)$ as the multiplicative state space of
M, and realize the Stone spectrum $\mathcal{S}(\mathcal{P}(M))$ as the space $\mathrm{Pt}(\mathcal{P}(M))$ of points of
$\mathcal{P}(M)$, then a homeomorphism $\Sigma(M) \cong \mathrm{Pt}(\mathcal{P}(M))$ arises as follows:

• First, the restriction $\varphi : \mathcal{P}(M) \to \mathbb{C}$ of any multiplicative state $\varphi : M \to \mathbb{C}$ must
be {0,1}-valued. Using (C.575) - (C.577), it is then easy to show that the ensuing
map $\varphi : \mathcal{P}(M) \to \{0,1\}$ is a homomorphism of Boolean lattices.

• Vice versa, by Corollary B.104 a point $\varphi : \mathcal{P}(M) \to \{0,1\}$ extends by continuity
to a map $\varphi : M \to \mathbb{C}$. Since $\varphi$ must preserve $\perp$, this map is nonzero. By continuity,
multiplicativity in general follows from multiplicativity on projections, which
follows by running the previous point backward (or from Theorem C.168).

Finally, (C.573) and (C.574) follow from (C.572) and the Gelfand isomorphism
(Theorem C.8) and eq. (D.35), respectively. See also Theorem C.168 below. □
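
To make the abelian case concrete, the following minimal sketch takes the simplest abelian von Neumann algebra, the diagonal n x n matrices (an illustrative choice of ours, with n = 3). Its projections are the 0/1-valued tuples, and its multiplicative states are the coordinate evaluations. The names `meet`, `join`, `complement` and `character` are hypothetical helpers, not notation from the text. The sketch checks the lattice formulas, distributivity, and that each multiplicative state restricts to a {0,1}-valued homomorphism of Boolean lattices.

```python
from itertools import product

n = 3  # illustrative: the abelian algebra of diagonal 3x3 matrices

# Projections in this algebra are exactly the 0/1-valued tuples.
projections = list(product((0, 1), repeat=n))

def meet(e, f):
    """e AND f, computed as the product ef."""
    return tuple(x * y for x, y in zip(e, f))

def join(e, f):
    """e OR f, computed as e + f - ef."""
    return tuple(x + y - x * y for x, y in zip(e, f))

def complement(e):
    """Orthocomplement, computed as 1 - e."""
    return tuple(1 - x for x in e)

# Distributivity, checked exhaustively on all triples of projections.
distributive = all(
    meet(e, join(f, g)) == join(meet(e, f), meet(e, g))
    for e in projections for f in projections for g in projections
)

def character(i):
    """Multiplicative state 'evaluate at coordinate i' (hypothetical name)."""
    return lambda e: e[i]

# Restricted to projections, each character is {0,1}-valued and preserves
# meet, join and complement, i.e. it is a point of the Boolean lattice.
characters_are_points = all(
    character(i)(meet(e, f)) == character(i)(e) * character(i)(f)
    and character(i)(join(e, f)) == max(character(i)(e), character(i)(f))
    and character(i)(complement(e)) == 1 - character(i)(e)
    for i in range(n) for e in projections for f in projections
)

print(distributive, characters_are_points)
```

In this toy case the Gelfand spectrum and the Stone spectrum are both the n-point set {0, ..., n-1}, which is the finite shadow of the homeomorphism in the theorem.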

Note that (C.574) is a special case of Corollary C.84, for if M is a commutative
von Neumann algebra, and $H(M)$ its frame of hereditary subalgebras, we have

$H(M) \cong \mathrm{Idl}(\mathcal{P}(M))$; (C.582)

$J \mapsto \{e \in \mathcal{P}(M) \mid Me \subseteq J\}$, (C.583)

whose inverse maps an ideal $I \subset \mathcal{P}(M)$ to the norm-closure of $\bigcup_{e \in I} Me$. In
particular, if J is $\sigma$-weakly closed, then $J = Me$ for a unique projection $e \in \mathcal{P}(M)$,
in which case the right-hand side of (C.583) is just the principal ideal $\downarrow e$. To see
this special case, we quote a useful result about arbitrary von Neumann algebras:

Proposition C.142. Let I be a $\sigma$-weakly closed left (right) ideal in a von Neumann
algebra M. Then there is a unique projection $e \in \mathcal{P}(M)$ such that $I = Me$ ($I = eM$).

Indeed, e is the $\sigma$-weak limit of any approximate identity in I.
C.22 The Murray-von Neumann classification of factors

After this analysis of abelian von Neumann algebras, we now turn to their opposites,
viz. factors. The main tool in the classification of factors, introduced by Murray and
von Neumann, is a new partial ordering $\precsim$ on the projection lattice $\mathcal{P}(M)$, which is
defined for general von Neumann algebras M. Unlike the familiar partial ordering
$\leq$ (see Proposition C.136), $\precsim$ gives a total ordering on $\mathcal{P}(M)$ if M is a factor.

Definition C.143. Let $\mathcal{P}(M)$ be the projection lattice of a von Neumann algebra
M. We say that $e \sim f$ in $\mathcal{P}(M)$ iff there exists $u \in M$ such that $u^*u = e$ and $uu^* = f$.

Subsequently, we write $e \precsim f$ if there is $e' \in \mathcal{P}(M)$ with $e \sim e'$ and $e' \leq f$.

It is easy to show that $\sim$ is an equivalence relation. The operator u in this definition
is unitary from $eH$ to $fH$, vanishes on $(eH)^\perp$, and has range $fH$. Such an operator
is therefore a partial isometry (cf. Definition A.27), with initial projection e and
final projection f. It follows that a necessary condition for $e \sim f$ is that $\dim(eH) = \dim(fH)$,
but (unless $M = B(H)$) this is by no means sufficient, since the unitary
u that maps $eH$ to $fH$ is required to lie in M. For example, if $H = \mathbb{C} \oplus \mathbb{C}$, then
$e = \mathrm{diag}(1,0)$ is equivalent to $f = \mathrm{diag}(0,1)$ with respect to $M = M_2(\mathbb{C})$, but not
with respect to $M = D_2(\mathbb{C}) = \mathbb{C} \oplus \mathbb{C}$ (i.e., the diagonal 2 x 2 matrices).
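
The 2 x 2 example can be checked by hand, but a small sketch may help fix the definitions. Below, matrices are nested lists and the helper names (`matmul`, `adjoint`, `close`, `diag`) are our own. The first check exhibits a partial isometry u in the full matrix algebra with initial projection diag(1,0) and final projection diag(0,1). The second check illustrates why no diagonal u can work: for diagonal u one always has u*u = uu*, so the initial and final projections would have to coincide.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

e = [[1, 0], [0, 0]]
f = [[0, 0], [0, 1]]

# In the full matrix algebra: u sends the first basis vector to the second
# and vanishes on the orthogonal complement of the range of e.
u = [[0, 0], [1, 0]]
equivalent_in_M2 = (close(matmul(adjoint(u), u), e)
                    and close(matmul(u, adjoint(u)), f))

def diag(z, w):
    return [[z, 0], [0, w]]

# In the diagonal algebra: u*u and uu* always agree, so u*u = e and
# uu* = f would force e = f. We only sample a few diagonal matrices here.
samples = [0, 1, -1, 1j, 0.5 + 0.5j]
diagonal_never_works = all(
    close(matmul(adjoint(d), d), matmul(d, adjoint(d)))
    for z in samples for w in samples for d in [diag(z, w)]
)

print(equivalent_in_M2, diagonal_never_works)
```

The sampled check is of course only an illustration of the algebraic identity u*u = uu* for commuting (here diagonal) matrices, not a proof.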

To see how natural this definition is, consider a unitary representation u of a
group G on H. If $H_i \subset H$ is stable under $u(G)$, i = 1,2, then the restrictions $u_i$ of
u to $H_i$ are unitarily equivalent precisely when $e_1 \sim e_2$ with respect to $M = u(G)'$
(where $e_i$ is the projection onto $H_i$). Furthermore, $u_1$ is unitarily equivalent to a
subrepresentation of $u_2$ iff $e_1 \precsim e_2$. More generally, if $N \subset B(H)$ is a von Neumann
algebra, with stable subspaces $H_i$, i = 1,2, then the restrictions of N to $H_i$ are unitarily
equivalent iff $e_1 \sim e_2$ with respect to $M = N'$, et cetera.

One may compare projections in M with sets and compare $\leq$, $\sim$, and $\precsim$ with $\subseteq$
(inclusion), $\cong$ (isomorphism), and $\hookrightarrow$ (the existence of an injective map), respectively.
The Schröder-Bernstein Theorem of set theory (which von Neumann knew
well) states that if $X \hookrightarrow Y$ and $Y \hookrightarrow X$, then $X \cong Y$. Similarly, it can be shown that:

Proposition C.144. If $e \precsim f$ and $f \precsim e$, then $e \sim f$.

The special role of factors with respect to the partial ordering $\precsim$ now emerges.

Proposition C.145. If M is a factor, then $\precsim$ is a total ordering (i.e., $e \precsim f$ or $f \precsim e$).

The property of a factor that leads to this result is:

Lemma C.146. Let M be a factor. For any nonzero projections $e, f \in \mathcal{P}(M)$ there
are nonzero projections $e', f' \in \mathcal{P}(M)$ such that $e' \leq e$, $f' \leq f$, and $e' \sim f'$.

The first step in the Murray-von Neumann classification of factors is as follows:

Definition C.147. A projection e in M is called finite if $f \sim e$ and $f \leq e$ for some
$f \in \mathcal{P}(M)$ implies $f = e$, and minimal if $f \leq e$, $f \in \mathcal{P}(M)$, implies $f = e$ or $f = 0$.

Accordingly, a factor M is called finite iff $1_M$ is finite, semifinite iff $1_M$ majorizes
a nonzero finite projection, and purely infinite iff all nonzero projections are infinite.
For $M = B(H)$, which is evidently a factor, a projection e is (in)finite iff $\dim(eH)$
is (in)finite, so that $B(H)$ is finite iff H is finite-dimensional, and semifinite otherwise.
Surprisingly, we will see that finite factors different from $M_n(\mathbb{C})$ exist, as do
semifinite factors different from $B(\ell^2)$. Even purely infinite factors (initially defined
as what was left out from the previous two cases) turn out to exist (even in physics).
We first rephrase Definition C.147 in terms of generalized traces.

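Definition C.148. A trace on a von Neumann algebra M is a map

$\mathrm{tr} : M^+ \to [0, \infty]$ (C.584)

satisfying

$\mathrm{tr}(\lambda \cdot a + b) = \lambda \cdot \mathrm{tr}(a) + \mathrm{tr}(b) \quad (a, b \in M^+, \lambda \geq 0)$; (C.585)

$\mathrm{tr}(aa^*) = \mathrm{tr}(a^*a) \quad (a \in M)$. (C.586)

Equivalently, $\mathrm{tr}(uau^*) = \mathrm{tr}(a)$ for all $a \in M^+$ and unitary $u \in M$ (so that $uau^* \in M^+$).

A trace is finite if $\mathrm{tr}(a) < \infty$ for all $a \in M^+$, semifinite if for any $a \in M^+$ there is
a nonzero $b \leq a$ in $M^+$ for which $\mathrm{tr}(b) < \infty$, and infinite otherwise.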

The usual trace Tr is a trace tr on $B(H)$ in this new sense, which is finite iff $\dim(H)$
is finite. As we will see, other factors admit other traces. The following result could
have been used as a definition of (semi)finite and purely infinite factors.

Proposition C.149. A factor is (semi)finite iff it admits a faithful $\sigma$-weakly continuous
(semi)finite trace, and is purely infinite otherwise.

It can be shown that a finite trace on a factor is automatically $\sigma$-weakly continuous,
so a factor is finite iff it admits a faithful finite trace. Hence we recover the fact that
$B(H)$ is finite iff $\dim(H) < \infty$, and semifinite otherwise. For a completely different
kind of trace, defined on factors remote from $B(H)$, we turn to discrete groups G.
For these, Haar measure is simply the counting measure, so that $L^2(G) = \ell^2(G)$, and
convolution (C.504) and involution (C.505), initially defined on $C_c(G)$, are given by

$f * g(x) = \sum_{y \in G} f(xy^{-1}) g(y); \quad f^*(x) = \overline{f(x^{-1})}$. (C.587)

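According to Definition C.119, the reduced group C*-algebra $C_r^*(G)$ is the norm-closure
of the *-algebra in $B(\ell^2(G))$ containing all convolution operators

$\psi \mapsto f * \psi, \quad f * \psi(x) = \sum_{y \in G} f(xy^{-1}) \psi(y) \quad (f \in C_c(G))$. (C.588)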

Thus $C_r^*(G)$ is realized as a concrete C*-algebra of operators on $\ell^2(G)$, so that,
following von Neumann himself, we may form the group von Neumann algebra

$W^*(G) = C_r^*(G)''$. (C.589)
Theorem C.150. The group von Neumann algebra $W^*(G)$ of a countable group is
a factor iff all nontrivial conjugacy classes in G (i.e., all except {e}) are infinite.

In that case, we say that G has (or "is") icc, i.e., has infinite conjugacy classes.

Proof. From (C.587), for $f \in C_c(G)$ we have $f * g = g * f$ for each $g \in C_c(G)$ iff
$f(yxy^{-1}) = f(x)$ for all $x, y \in G$. In other words, f lies in the center of $C_c(G) \subset W^*(G)$
iff f is constant on each conjugacy class of G. If G is icc, this implies that
f can have support only at e, i.e., $f = \lambda \cdot \delta_e$, $\lambda \in \mathbb{C}$. Noting that $\delta_e$ is the unit in the
algebra $C_c(G)$, this proves the claim, except for the fact that we should extend this
argument from $C_c(G)$ to $W^*(G)$, which by Theorem C.127 is its strong closure.

The key to this extension is the fact that one has $f * g(x) = \langle R_{x^{-1}} f^*, g \rangle$ for f and
g in $C_c(G)$, where $R_x f(y) = f(yx)$ and the inner product is in $\ell^2(G)$. Hence

$|f * g(x)| = |\langle R_{x^{-1}} f^*, g \rangle| \leq \|R_{x^{-1}} f^*\|_2 \|g\|_2 = \|f\|_2 \|g\|_2$, (C.590)

so that the sum in (C.587) is actually defined and converges (absolutely) for $f, g \in \ell^2(G)$.
This also shows that if $f_n$ strongly converges to some $a \in B(\ell^2(G))$, i.e.,
$\|f_n * \psi - a\psi\| \to 0$ for each $\psi \in \ell^2(G)$, then $a\psi = f * \psi$, where $f \in \ell^2(G)$ is the
limit of $(f_n)$ seen as a sequence in $\ell^2(G)$. Hence $W^*(G) \subset \ell^2(G)$, and the above
computation of the center of $W^*(G)$ remains valid: we have $f \in W^*(G) \cap W^*(G)'$
iff f is constant on each conjugacy class of G. Conversely, any f that is constant
on some finite conjugacy class (different from {e}) and zero elsewhere is central
without being a multiple of the unit. □
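
The algebraic core of this proof can be tried out on a small finite group. The sketch below uses the permutation group on three letters (our choice, purely for illustration; being finite, it is certainly not icc) and represents functions on the group as dictionaries. The helper names `mul`, `inv`, `convolve`, `delta`, `is_central` and `trace` are hypothetical. It checks that the indicator function of a conjugacy class is central for convolution, that the indicator of a single transposition is not, and that evaluation at the unit element has the trace property on a sample pair of functions.

```python
from itertools import permutations

# The group of permutations of {0, 1, 2}; a permutation p maps i to p[i].
G = list(permutations(range(3)))
identity = (0, 1, 2)

def mul(p, q):
    """Group product: first apply q, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    result = [0] * 3
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def convolve(f, g):
    """(f * g)(x) = sum over y of f(x y^{-1}) g(y)."""
    return {x: sum(f[mul(x, inv(y))] * g[y] for y in G) for x in G}

def delta(p):
    return {x: (1 if x == p else 0) for x in G}

def is_central(f):
    """The deltas span all functions, so commuting with them suffices."""
    return all(convolve(f, delta(y)) == convolve(delta(y), f) for y in G)

def trace(f):
    """Evaluation at the unit element."""
    return f[identity]

t = (1, 0, 2)  # a transposition
conjugacy_class = {mul(mul(y, t), inv(y)) for y in G}
class_indicator = {x: (1 if x in conjugacy_class else 0) for x in G}

# Two arbitrary integer-valued test functions.
f = {x: i + 1 for i, x in enumerate(G)}
g = {x: (i * i) % 5 for i, x in enumerate(G)}

print(len(conjugacy_class), is_central(class_indicator), is_central(delta(t)))
print(trace(convolve(f, g)) == trace(convolve(g, f)))
```

Since the class of transpositions is finite, its indicator is a central element that is not a multiple of the unit, which is exactly the obstruction described in the last sentence of the proof.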

Whether or not G has icc, we have a map $\mathrm{tr} : W^*(G) \to \mathbb{C}$, defined by

$\mathrm{tr}(f) = f(e)$, (C.591)

which satisfies (C.585) - (C.586) and hence defines a finite trace on $W^*(G)$. Also,

$\mathrm{tr}(f) = \langle \delta_e, f * \delta_e \rangle$, (C.592)

so this trace is $\sigma$-weakly continuous.

Corollary C.151. If G has icc, $W^*(G)$ is a finite factor non-isomorphic to any $B(H)$.

Since G must obviously be infinite for it to have icc, $W^*(G)$ is infinite-dimensional,
and hence $W^*(G) \ncong M_n(\mathbb{C})$ for any $n \in \mathbb{N}$. Furthermore, if H is infinite-dimensional,
then $B(H)$ does not admit any $\sigma$-weakly continuous finite faithful trace:

Proposition C.152. Any two nonzero $\sigma$-weakly continuous (semi)finite traces tr, tr'
on a (semi)finite factor are proportional, i.e., $\mathrm{tr}' = \lambda\, \mathrm{tr}$ for some $\lambda > 0$.

See also Theorem C.155 below. Consequently, since Tr and tr are both $\sigma$-weakly
continuous, and $\mathrm{Tr}(1_H) = \infty$ on $B(H)$, whereas $\mathrm{tr}(1_{\ell^2(G)}) = 1$ on $W^*(G)$, we conclude
that $W^*(G) \ncong B(H)$ for any H. Note also that (still assuming that G has icc),
all projections in $W^*(G)$ are finite, and $W^*(G)$ has no minimal projections (see below),
whereas $B(H)$ has both finite and infinite projections, and also has plenty of
minimal projections, namely those with one-dimensional range.
Do such "icc" groups actually exist? In fact, there are infinitely many of them:
each free group on n > 1 generators is an example. Another example is the group $S_\infty$
of finite permutations of $\mathbb{N}$. A j-cycle is a cyclic permutation of j objects (called the
carrier of the cycle in question). Any element p of $S_\infty = \bigcup_n S_n$ is a finite product
of j-cycles with disjoint carriers, and for each $j \in \mathbb{N}$, the number of j-cycles in such
a decomposition of p is uniquely determined by p. Two permutations in $S_\infty$, then,
are conjugate iff they have the same number of j-cycles, for all $j \in \mathbb{N}$. Since the
carriers of the nontrivial cycles of $p \neq e$ can be moved to infinitely many different
places in $\mathbb{N}$, every nontrivial conjugacy class in $S_\infty$ is infinite.
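
The cycle-type criterion, and the way conjugacy classes grow with the number of permuted objects, can be explored by brute force in small symmetric groups. The following sketch is only an illustration under our own conventions (permutations as tuples, hypothetical helper names `compose`, `inverse`, `cycle_type`, `conjugacy_class`, `transposition`). It compares a conjugacy class in the permutations of four objects with the set of permutations of the same cycle type, and lists the size of the class of a transposition for 3, 4 and 5 objects.

```python
from itertools import permutations

def compose(p, q):
    """First apply q, then p."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def cycle_type(p):
    """Sorted tuple of cycle lengths (fixed points count as 1-cycles)."""
    seen, lengths = set(), []
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))

def conjugacy_class(p):
    """All g p g^{-1}, with g ranging over the permutations of len(p) objects."""
    n = len(p)
    return {compose(compose(g, p), inverse(g)) for g in permutations(range(n))}

def transposition(n):
    """The permutation swapping 0 and 1 and fixing everything else."""
    return (1, 0) + tuple(range(2, n))

# Conjugacy versus cycle type, for a 3-cycle on four objects.
p = (1, 2, 0, 3)
same_type = {q for q in permutations(range(4)) if cycle_type(q) == cycle_type(p)}
criterion_holds = conjugacy_class(p) == same_type

# The class of a transposition has n(n-1)/2 elements, which is unbounded in n.
class_sizes = [len(conjugacy_class(transposition(n))) for n in (3, 4, 5)]

print(criterion_holds, class_sizes)
```

The unbounded growth of the class sizes is the finite counterpart of the statement that the corresponding class in the group of all finite permutations is infinite.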

We present the type classification of factors due to Murray and von Neumann. 

Definition C.153. A factor M is said to be of type:

• I if it has at least one minimal projection, subdivided into:

- Type $\mathrm{I}_n$ ($n \in \mathbb{N}$) if M is finite and $1_M$ is the sum of n minimal projections.

- Type $\mathrm{I}_\infty$ if M is type I and semifinite but not finite.

• II if it has no minimal projections, but has some nonzero finite projection, with:

- Type $\mathrm{II}_1$ if M is type II and finite.

- Type $\mathrm{II}_\infty$ if M is type II and semifinite but not finite.

• III if all nonzero projections are infinite.

A nice understanding of these types arises from a construction similar to the trace. 

Definition C.154. A dimension function on a von Neumann algebra M is a function
$d : \mathcal{P}(M) \to [0, \infty]$ such that $d(e) < \infty$ iff e is finite, $d(e + f) = d(e) + d(f)$ if
$ef = 0$ (i.e., $eH \perp fH$), and $d(e) = d(f)$ if $e \sim f$.

Paraphrasing results in Murray and von Neumann’s great series of papers, we have: 

Theorem C.155. For any von Neumann algebra M, the restriction of a trace to
$\mathcal{P}(M)$ is a dimension function. If $M \subset B(H)$ is a factor, with H separable, then:

1. Any $\sigma$-weakly continuous trace on M restricts to a completely additive dimension
function with the additional property that $d(e) = d(f)$ if and only if $e \sim f$.

2. Any dimension function with this additional property arises from a $\sigma$-weakly
continuous trace, and hence is completely additive, and unique up to scaling.

3. In that case, the dimension function d induces an isomorphism between $\mathcal{P}(M)/\sim$
and some subset of $[0, \infty]$. Suitably scaling d, this subset must be one of:

• $\{0, 1, 2, \ldots, n\}$, for some $n \in \mathbb{N}$ (type $\mathrm{I}_n$).

• $\mathbb{N} \cup \{\infty\}$ (type $\mathrm{I}_\infty$).

• $[0, 1]$ (type $\mathrm{II}_1$).

• $[0, \infty]$ (type $\mathrm{II}_\infty$).

• $\{0, \infty\}$ (type III).

We may now strengthen the few examples we had so far in the following way: 

Corollary C.156. • If $\dim(H) = n$, then $B(H)$ is a factor of type $\mathrm{I}_n$.

• If $\dim(H) = \infty$, then $B(H)$ is a factor of type $\mathrm{I}_\infty$.

• Let G be icc. Then $W^*(G)$ is a factor of type $\mathrm{II}_1$.
C.23 Classification of hyperfinite factors

Throughout this section we assume that our von Neumann algebras $M \subset B(H)$ act
on a separable Hilbert space H. We say that M is hyperfinite if $M = (\bigcup_n M_n)''$, for
a family of finite-dimensional von Neumann subalgebras $M_n \subset M$ with $M_n \subset M_{n+1}$.
For example, $M = B(H)$ is hyperfinite. If G is a group such that $G = \bigcup_n G_n$ for finite
subgroups $G_n \subset G_{n+1}$, as is the case e.g. for the (icc) group $S_\infty = \bigcup_n S_n$ of finite
permutations of $\mathbb{N}$, then the associated von Neumann algebra $W^*(G)$ is hyperfinite.
Murray and von Neumann partly classified hyperfinite factors, as follows:

Theorem C.157. Let $M \subset B(H)$ be a hyperfinite factor.

• If M is type $\mathrm{I}_n$, then $M \cong M_n(\mathbb{C})$.

• If M is type $\mathrm{I}_\infty$, then $M \cong B(\ell^2)$.

• If M is type $\mathrm{II}_1$, then $M \cong W^*(S_\infty)$.

The unique hyperfinite $\mathrm{II}_1$-factor $W^*(S_\infty)$, which turns out to be isomorphic to
$W^*(G)$ for any countable amenable icc group G (but not, for example, for the free
groups on n > 1 generators), is usually called R. Similarly (and
trivially), $B(\ell^2) \cong B(H')$ for any separable infinite-dimensional Hilbert space H'.

An example of a hyperfinite $\mathrm{II}_\infty$ factor is also quickly found, viz. $M = R \,\overline{\otimes}\, B(\ell^2)$,
but Murray and von Neumann were unable to classify such factors. About type III,
they knew almost nothing, except for a couple of examples from ergodic theory.
Between 1971 and 1975, Connes made two decisive steps forward in this area:

1. Dividing type III factors into $\mathrm{III}_\lambda$, $\lambda \in [0,1]$, by means of a new invariant.

2. Completely classifying hyperfinite type $\mathrm{II}_\infty$ and type III factors, as follows:

• There is a unique hyperfinite $\mathrm{II}_\infty$ factor, namely $R \,\overline{\otimes}\, B(\ell^2)$.

• There is a unique hyperfinite $\mathrm{III}_1$ factor (Connes and Haagerup).

• There is a unique hyperfinite $\mathrm{III}_\lambda$ factor for each $\lambda \in (0,1)$.

• There is an infinite family of hyperfinite $\mathrm{III}_0$ factors, completely classified by
the so-called flow of weights introduced by Connes and Takesaki.

We list $\mathrm{III}_1$ separately from $\mathrm{III}_\lambda$ for $\lambda \in (0,1)$ for two reasons: first, "hyperfinite
$\mathrm{III}_1$" turns out to be the factor occurring in quantum field theory and quantum statistical
mechanics of infinite systems, whereas $\mathrm{III}_\lambda$ for $\lambda \in (0,1)$ seems artificial; and
second, the proof of uniqueness of the hyperfinite $\mathrm{III}_1$ factor is much more difficult.

An important technical tool of Connes was his own profound discovery that a
von Neumann algebra $M \subset B(H)$ is hyperfinite iff it is injective, in that there exists
a conditional expectation $E : B(H) \to M$ (not required to be $\sigma$-weakly continuous),
that is, a linear map $E : B(H) \to B(H)$ such that $E(a) \in M$ and $E(a^*) = E(a)^*$ for all
$a \in B(H)$, $E^2 = E$, and $\|E\| = 1$. It follows that $E(abc) = aE(b)c$ for all $a, c \in M$, $b \in B(H)$.
The equivalence of hyperfiniteness and injectivity implies, for example, that if
$M = N \,\overline{\otimes}\, B(\ell^2)$ is hyperfinite, then so is N. Another crucial tool was the Tomita-Takesaki
theory, which we briefly summarize (this theory was paralleled by simultaneous
and independent work by the German-Dutch mathematical
physics trio Haag-Hugenholtz-Winnink, which among other things allowed a direct
definition of thermal equilibrium states in infinite volume, see §9.6).
Definition C.158. A von Neumann algebra $M \subset B(H)$ is in standard form if H
contains a unit vector $\Omega$ that is cyclic and separating for M.

Recall that $\Omega$ is separating for M if $a\Omega \neq 0$ for all nonzero $a \in M$, and that $\Omega$ is
cyclic for M iff it is separating for M'. Any von Neumann algebra can be brought
into standard form. For separable H, this follows by picking an injective density
operator $\rho$ on H, whose associated state $\omega(a) = \mathrm{Tr}(\rho a)$ is faithful (in that $\omega(a^*a) > 0$
for all nonzero $a \in M$), and passing to the GNS-representation $\pi_\omega(M) \cong M$. For
example, $M = B(H)$ acting on H is not in standard form, but acting on $B_2(H)$ by left
multiplication it is, where $B_2(H)$ is the Hilbert space of Hilbert-Schmidt operators
on H with the familiar inner product $\langle a, b \rangle = \mathrm{Tr}(a^*b)$. If $\rho \in B_1(H)$ is an injective
density operator on H, then $\Omega = \sqrt{\rho} \in B_2(H)$ brings M into standard form. In this
case, $M' \cong B(H)^{\mathrm{op}}$ (where the suffix "op" means that multiplication is done in the
opposite order, i.e. ab in $B(H)^{\mathrm{op}}$ is equal to ba in $B(H)$), which acts on $B_2(H)$ by
right multiplication. If $H = \mathbb{C}^n$, one simply has $B(H) = B_2(H) = M_n(\mathbb{C})$.

Let $M \subset B(H)$ be in standard form. Tomita introduced the (unbounded) antilinear
operator S as the closure of the operator $S_0$ having domain $D(S_0) = M\Omega$ and action

$S_0(a\Omega) = a^*\Omega$. (C.593)

This domain is dense because $\Omega$ is cyclic for M, the action is well defined since
$\Omega$ is separating for M, and $S_0$ indeed turns out to be closable, with closure S. Any
closed operator a has a polar decomposition $a = v|a|$, where v is a partial isometry
and $|a| = \sqrt{a^*a}$. We write the polar decomposition of the above operator S as

$S = J\Delta^{1/2}$, (C.594)

where J is an antilinear partial isometry, and $\Delta = S^*S$. Since S is injective with dense
range, J is actually anti-unitary, satisfying $J^* = J$ and $J^2 = 1$. Furthermore, $\Delta > 0$,
so that $\Delta^{it}$ is well defined for $t \in \mathbb{R}$: writing $\Delta = \exp(h)$ for the self-adjoint operator
$h = \log \Delta$, we have $\Delta^{it} = \exp(ith)$. We then have the Tomita-Takesaki Theorem:

Theorem C.159. Let $M \subset B(H)$ be a von Neumann algebra in standard form. Then:

• $M' = JMJ = \{JaJ \mid a \in M\}$.

• For each $t \in \mathbb{R}$ and $a \in M$, the operator $\alpha_t(a) = \Delta^{it} a \Delta^{-it}$ lies in M.

• The map $t \mapsto \alpha_t$ is a group homomorphism from $\mathbb{R}$ to Aut(M) (i.e., the group of
all automorphisms of M), which is continuous, in that for each $a \in M$ the function
$t \mapsto \alpha_t(a)$ from $\mathbb{R}$ to M (with $\sigma$-weak topology) is continuous.
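
In finite dimension everything in this theorem can be written down explicitly. The sketch below takes the 2 x 2 matrices acting on themselves by left multiplication, with the cyclic and separating vector given by the square root of a diagonal density matrix with eigenvalues 0.7 and 0.3 (an arbitrary choice of ours). It assumes the standard finite-dimensional formulas, which are not derived in the text: the modular operator acts as x -> rho x rho^{-1}, the modular conjugation as x -> x*, and the modular group as a -> rho^{it} a rho^{-it}. All function names are hypothetical. The checks confirm Tomita's defining relation, the group law, and invariance of the state under the modular group.

```python
import math

p = (0.7, 0.3)  # eigenvalues of a diagonal density matrix rho (our choice)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# The cyclic and separating vector: the square root of rho.
Omega = [[math.sqrt(p[0]), 0], [0, math.sqrt(p[1])]]

def delta_half(x):
    """Assumed formula: square root of the modular operator, x -> rho^{1/2} x rho^{-1/2}."""
    return [[math.sqrt(p[i] / p[j]) * x[i][j] for j in range(2)] for i in range(2)]

def J(x):
    """Assumed formula: modular conjugation x -> x*."""
    return adjoint(x)

def S(x):
    return J(delta_half(x))

def modular(t, a):
    """Assumed formula: modular group a -> rho^{it} a rho^{-it}."""
    return [[p[i] ** (1j * t) * a[i][j] * p[j] ** (-1j * t) for j in range(2)]
            for i in range(2)]

def state(a):
    """The state a -> Tr(rho a)."""
    return p[0] * a[0][0] + p[1] * a[1][1]

a = [[1 + 2j, 3], [-1j, 0.5]]

tomita_ok = close(S(matmul(a, Omega)), matmul(adjoint(a), Omega))
group_ok = close(modular(0.4, modular(1.1, a)), modular(1.5, a))
invariant_ok = abs(state(modular(0.9, a)) - state(a)) < 1e-12

print(tomita_ok, group_ok, invariant_ok)
```

In this model the conjugated operator J a J acts as right multiplication by a*, which is consistent with the commutant being the opposite algebra acting from the right.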

The image of $\mathbb{R}$ in Aut(M) by $\alpha$ is called the modular group of M associated with
the cyclic and separating vector $\Omega$ (or rather, with the associated $\sigma$-weakly continuous
faithful state $\omega$). Simple examples show that the modular group explicitly
depends on the vector $\Omega$. In his thesis, Connes analyzed the dependence of $\alpha$ on $\Omega$,
and showed it was innocent. To state the simplest version of his result, assume that
H contains two different vectors $\Omega_1$ and $\Omega_2$, each of which is cyclic and separating
for M. We write $\alpha^{(i)}$ for the modular group derived from $\Omega_i$, i = 1,2.
Theorem C.160. There is a family $U_t$ of unitary operators in M ($t \in \mathbb{R}$), such that

$\alpha_t^{(1)}(a) = U_t \alpha_t^{(2)}(a) U_t^*$; (C.595)

$U_{s+t} = U_s \alpha_s^{(2)}(U_t)$. (C.596)


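Proof. The proof of this theorem is Connes's favourite (as he declared in an interview),
so we present it in some detail. It is based on the following idea. Extend M
to $\mathrm{Mat}_2(M)$, i.e., the von Neumann algebra of 2 x 2 matrices with entries in M, and
let $\mathrm{Mat}_2(M)$ act on $H_2 = H \oplus H$ in the obvious way. Subsequently, let $\mathrm{Mat}_2(M)$
act on $H_4 = H \oplus H \oplus H \oplus H = H_2 \oplus H_2$ by simply doubling the action on $H_2$. The
vector $\Omega = (\Omega_1, 0, 0, \Omega_2) \in H_4$ is then cyclic and separating for $\mathrm{Mat}_2(M)$, with
corresponding modular operator $\Delta = \mathrm{diag}(\Delta_1, \Delta_4, \Delta_3, \Delta_2)$. Here $\Delta_1$ and $\Delta_2$ are just the
operators on H originally defined by $\Omega_1$ and $\Omega_2$, respectively, and $\Delta_3$ and $\Delta_4$ are
certain operators on H. Denoting elements of $\mathrm{Mat}_2(M)$ by

$a = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, (C.597)

we then have

$\Delta^{it} \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \Delta^{-it} = \begin{pmatrix} \alpha'_t(a) & 0 \\ 0 & \alpha''_t(a) \end{pmatrix}$; (C.598)

$\alpha'_t(a) = \begin{pmatrix} \Delta_1^{it} a_{11} \Delta_1^{-it} & \Delta_1^{it} a_{12} \Delta_4^{-it} \\ \Delta_4^{it} a_{21} \Delta_1^{-it} & \Delta_4^{it} a_{22} \Delta_4^{-it} \end{pmatrix}$; (C.599)

$\alpha''_t(a) = \begin{pmatrix} \Delta_3^{it} a_{11} \Delta_3^{-it} & \Delta_3^{it} a_{12} \Delta_2^{-it} \\ \Delta_2^{it} a_{21} \Delta_3^{-it} & \Delta_2^{it} a_{22} \Delta_2^{-it} \end{pmatrix}$. (C.600)

But by Theorem C.159, the right-hand side of (C.598) must be of the form diag(b, b)
for some $b \in \mathrm{Mat}_2(M)$, so that $\alpha'_t(a) = \alpha''_t(a)$; we write $\alpha_t(a)$ for this common value.
This allows us to replace $\Delta_4^{it} a_{22} \Delta_4^{-it}$
in (C.599) by $\Delta_2^{it} a_{22} \Delta_2^{-it}$. We then put $U_t = \Delta_1^{it} \Delta_4^{-it}$, which, unlike either $\Delta_1^{it}$ or $\Delta_4^{-it}$,
lies in M, because each entry in $\alpha_t(a)$ must lie in M if all the $a_{ij}$ do, and here we
have taken $a_{12} = 1$. All claims of the theorem may then be verified using elementary
computations with 2 x 2 matrices. For example, combining

$\begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & a \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$ (C.601)

with the property $\alpha_t(ab) = \alpha_t(a) \alpha_t(b)$, we recover (C.595). Using the identity

$\begin{pmatrix} 0 & U_t \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & U_t \end{pmatrix}$, (C.602)

evolving each side to time s yields (C.596). A proof from The Book! □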
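
The two relations of the theorem can be tested numerically in the simplest setting. The sketch below works with 2 x 2 matrices and two faithful states given by density matrices rho1 and rho2 (both chosen arbitrarily by us, with rho2 not commuting with rho1). It assumes the standard finite-dimensional expressions, which are not taken from the text: the modular group of rho is a -> rho^{it} a rho^{-it}, and the unitary family is rho1^{it} rho2^{-it}. The helper names `unitary_power`, `alpha` and `U` are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def unitary_power(eigs, V, t):
    """rho^{it} for rho = V diag(eigs) V*, with positive eigenvalues eigs."""
    d = [[eigs[0] ** (1j * t), 0], [0, eigs[1] ** (1j * t)]]
    return matmul(matmul(V, d), adjoint(V))

identity = [[1, 0], [0, 1]]
theta = 0.6
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# Each density matrix is stored as (eigenvalues, diagonalizing unitary).
rho1 = ((0.7, 0.3), identity)
rho2 = ((0.4, 0.6), rotation)

def alpha(rho, t, a):
    """Assumed formula for the modular group: a -> rho^{it} a rho^{-it}."""
    w = unitary_power(rho[0], rho[1], t)
    return matmul(matmul(w, a), adjoint(w))

def U(t):
    """Assumed formula for the unitary family: rho1^{it} rho2^{-it}."""
    return matmul(unitary_power(rho1[0], rho1[1], t),
                  unitary_power(rho2[0], rho2[1], -t))

a = [[1 + 2j, 3], [-1j, 0.5]]
s, t = 0.8, 1.3

# First relation: the two modular groups differ by conjugation with U(t).
intertwining_ok = close(
    alpha(rho1, t, a),
    matmul(matmul(U(t), alpha(rho2, t, a)), adjoint(U(t))),
)

# Second relation: the cocycle identity for U.
cocycle_ok = close(U(s + t), matmul(U(s), alpha(rho2, s, U(t))))

unitary_ok = close(matmul(adjoint(U(t)), U(t)), identity)

print(intertwining_ok, cocycle_ok, unitary_ok)
```

Both modular groups here are inner, as they must be for a type I factor, so this example only illustrates the algebra of the cocycle and not the type III phenomena that motivate it.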
We say that an automorphism $\gamma : M \to M$ is inner if there exists a unitary element
$u \in M$ such that $\gamma(a) = uau^*$ for all $a \in M$. The inner automorphisms of M form a
normal subgroup Inn(M) of the group Aut(M) of all automorphisms, with quotient
Out(M) = Aut(M)/Inn(M). Theorem C.160 shows that the image $\pi(\alpha(\mathbb{R}))$ of the
modular group in Out(M) under the canonical projection $\pi$ : Aut(M) $\to$ Out(M) is
independent of $\Omega$, and invariants of this image will be invariants of M itself.

Such invariants are trivial if M is a factor of type I or II, since in that case
$\pi(\alpha(\mathbb{R})) = \{e\}$; to see this in the finite case (i.e., type $\mathrm{I}_n$ or type $\mathrm{II}_1$), take a finite
trace $\tau$ on M and check that $\Delta = 1$ for $\pi_\tau(M) \cong M$. For the semifinite but not finite
case (i.e., type $\mathrm{I}_\infty$ or type $\mathrm{II}_\infty$), a slight generalization of the GNS-construction leads
to the same conclusion. To find invariants for type III factors, we therefore need to
extract information from the modular group $t \mapsto \alpha_t$, up to inner automorphisms.

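Definition C.161. Let $\alpha : \mathbb{R} \to \mathrm{Aut}(M)$ be a continuous action of $\mathbb{R}$ on M, defining:

$M^\alpha = \{x \in M \mid \alpha_t(x) = x \ \forall t \in \mathbb{R}\}$; (C.603)

$M_e = \{x \in M \mid xe = ex = x\} \quad (e \in \mathcal{P}(M))$. (C.604)

• The Arveson spectrum sp($\alpha$) of $\alpha$ consists of all $p \in \mathbb{R}$ for which there is a
sequence $(x_n)$ in M with $\|x_n\| = 1$ and $\lim_{n \to \infty} \|\alpha_t(x_n) - e^{ipt} x_n\| = 0 \ \forall t \in \mathbb{R}$.

• For each $e \in \mathcal{P}(M^\alpha)$, the map $\alpha_t : M \to M$ restricts to $\alpha_t^e : M_e \to M_e$, defining a
(group) homomorphism $\alpha^e : \mathbb{R} \to \mathrm{Aut}(M_e)$, $t \mapsto \alpha_t^e$. The Connes spectrum of $\alpha$
is $\Gamma(\alpha) = \exp(\Gamma'(\alpha)) \subset \mathbb{R}^+_*$, where $\Gamma'(\alpha) = \bigcap_{0 \neq e \in \mathcal{P}(M^\alpha)} \mathrm{sp}(\alpha^e) \subset \mathbb{R}$ and
$\mathbb{R}^+_* = (0, \infty)$ is the multiplicative group of positive reals.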

The Connes spectrum $\Gamma(\alpha)$ is a closed subgroup of the multiplicative group $\mathbb{R}^+_* = (0, \infty)$,
which has the great virtue
that if $\pi(\alpha(\mathbb{R})) = \pi(\alpha'(\mathbb{R}))$, then $\Gamma(\alpha) = \Gamma(\alpha')$. So if $\alpha$ is the modular group of
M with respect to some state $\omega$, then $\Gamma(\alpha)$ is independent of $\omega$, and may therefore
be called $\Gamma(M)$. This invariant can also be defined through the usual spectrum of
self-adjoint operators on Hilbert space. To this effect, Connes defined and proved

$S(M) = \bigcap_\omega \sigma(\Delta_\omega) = \bigcap_{0 \neq e \in \mathcal{P}(M^\alpha)} \sigma(\Delta_{\varphi_e})$, (C.605)

where the first intersection is over all $\sigma$-weakly continuous faithful states $\omega$ on M,
whereas in the second one takes a fixed $\sigma$-weakly continuous faithful state $\varphi$ on M,
and restricts it to $\varphi_e = \varphi|_{M_e}$. Furthermore, $\Delta_\omega$ denotes the operator $\Delta$ on $H_\omega$, defined
with respect to the usual cyclic unit vector $\Omega_\omega$ of the GNS-construction, etc. If M is
a type I or II factor, one has $S(M) = \{1\}$, whereas $0 \in S(M)$ iff M is type III.

Connes showed that $\Gamma(M) = S(M) \cap (0, \infty)$, and the known classification of closed
subgroups of the multiplicative group $(0, \infty)$ yields his path-breaking parametrization of type III factors:

Definition C.162. Let M be a type III factor. Then M is said to be of type:

• $\mathrm{III}_0$ if $\Gamma(M) = \{1\}$;

• $\mathrm{III}_\lambda$, where $\lambda \in (0,1)$, if $\Gamma(M) = \lambda^{\mathbb{Z}}$;

• $\mathrm{III}_1$ if $\Gamma(M) = (0, \infty)$.

The unique hyperfinite $\mathrm{III}_1$ factor appears throughout algebraic quantum field theory,
where it plays the role of a universal algebra of localized observables.
C.24 Other special classes of C*-algebras

There are many other special classes of C*-algebras apart from von Neumann algebras
and commutative C*-algebras. The classes we consider here contain both commutative
and non-commutative C*-algebras; in the spirit of (exact) Bohrification,
whenever possible we try to characterize them through properties of their (maximal)
commutative subalgebras. Like the von Neumann algebras already studied, each
class in this section is sandwiched between the finite-dimensional C*-algebras, i.e.
those C*-algebras that are finite-dimensional as a vector space (which it contains),
and the real rank zero C*-algebras defined below (in which it is contained).
Finite-dimensional C*-algebras admit a straightforward classification:

Theorem C.163. Every finite-dimensional C*-algebra A is isomorphic to a direct
sum of matrix algebras, i.e., $A \cong \bigoplus_k M_{n_k}(\mathbb{C})$, where $n_k \in \mathbb{N}$, and the sum is finite.

Proof. Let A be a finite-dimensional C*-algebra, and take the injective representation
$\pi_c = \bigoplus_{\omega \in P(A)} \pi_\omega$ on $H_c = \bigoplus_{\omega \in P(A)} H_\omega$, where $P(A)$ is the pure state space of
A; cf. the last stage of the proof of Theorem C.87. The proof now unfolds:

1. Since $H_\omega$ is the closure of $\pi_\omega(A)\Omega_\omega$, it must be finite-dimensional.

2. Since each $\omega$ is pure, by Theorem C.90 we must have $\pi_\omega(A)'' = B(H_\omega)$.

3. By Theorem C.127, $\pi_\omega(A)''$ equals the weak or strong closure of $\pi_\omega(A)$, but since
this algebra is finite-dimensional by step 1, these closures coincide with $\pi_\omega(A)$,
and hence $\pi_\omega(A) = B(H_\omega) \cong M_n(\mathbb{C})$, where $n = \dim(H_\omega)$.

4. One can find an injective subrepresentation $\pi_f$ of $\pi_c$ using only a finite number of
pure states (proof by contradiction to $\dim(A) < \infty$), so that $\pi_f(A) \cong A$. □

The real rank of a C*-algebra A is a non-commutative generalization of the
(Lebesgue) covering dimension of a non-empty space X, defined as follows. First
say that $\dim(X) \leq n$ iff every open cover of X has an open refinement $\mathcal{V}$ for
which every $x \in X$ is contained in at most n + 1 elements of $\mathcal{V}$. We then say that
$\dim(X) = n$ iff $\dim(X) \leq n$ but $\dim(X) \nleq n - 1$ (such n need not exist).

If X is a compact Hausdorff space, then $\dim(X) = n$ iff n is the smallest integer
such that for every $f \in C(X, \mathbb{R}^{n+1})$ and $\varepsilon > 0$, there is $g \in C(X, \mathbb{R}^{n+1})$ such that
$g(x) \neq 0$ for all x and $\|f - g\|_\infty < \varepsilon$, where $\|f\|_\infty = \sup_{x \in X}\{\|f(x)\|\}$. If no such n
exists, we say that $\dim(X) = \infty$. If $g : X \to \mathbb{R}^{n+1}$ is described by its coordinates
$(g_1, \ldots, g_{n+1})$, then $g(x) \neq 0$ iff $\sum_k g_k(x)^2 > 0$; hence g vanishes nowhere iff $\sum_k g_k^2$ is invertible
in $C(X)$. We may replace the usual norm $\|v\|$ in $\mathbb{R}^{n+1}$ by the equivalent max-norm,
i.e., $\|v\| = \max_i\{|v_i|\}$, where $v = (v_1, \ldots, v_{n+1})$. If we do so, we may generalize the
covering dimension to possibly noncommutative unital C*-algebras, as follows.

Let $A^n = A \oplus \cdots \oplus A$ (with n terms) be the C*-algebra $A \times \cdots \times A$ with pointwise
operations and norm $\|(a_1, \ldots, a_n)\| = \max_i\{\|a_i\|\}$. Let $\Omega(A^n)$ be the set of all self-adjoint
elements $(a_1, \ldots, a_n)$ in $A^n$ for which $\sum_k a_k^2$ is invertible (i.e. in A). The real
rank rr(A) of a unital C*-algebra A is defined as the smallest integer n for which
$\Omega(A^{n+1})$ is dense in the self-adjoint part of $A^{n+1}$, i.e., if for every self-adjoint $a \in A^{n+1}$
and $\varepsilon > 0$, there is $b \in \Omega(A^{n+1})$
such that $\|a - b\| < \varepsilon$. If no such n exists, we define rr(A) = $\infty$. If A has no unit, we
define its real rank as $\mathrm{rr}(A) = \mathrm{rr}(\dot{A})$, i.e., as the real rank of its unitization $\dot{A}$.
Taking $A = C(X)$, it follows from the previous paragraph that

$\mathrm{rr}(C(X)) = \dim(X)$. (C.606)

Now $\dim(X) = 0$ iff X has a basis of clopen sets, and if X is compact Hausdorff,
then $\dim(X) = 0$ iff X is a Stone space. Hence from (C.606) we immediately have:

Proposition C.164. If A is a commutative C*-algebra, rr(A) = 0 iff $\Sigma(A)$ is Stone.

This makes dimension zero somewhat pathological. On the other hand, for noncommutative
C*-algebras real rank zero is ubiquitous. Note that if $a = a^*$ and $a^2$
is invertible, then a itself is invertible, with inverse $a^{-1} = a (a^2)^{-1}$.
Thus A has real rank zero iff its invertible self-adjoint elements are dense in $A_{sa}$.

Proposition C.165. Any von Neumann algebra has real rank zero. 

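Proof. For $a \in A_{\mathrm{sa}}$ and $\varepsilon > 0$, with $A \subset B(H)$, use Theorem B.102 to define

\[ b = \left(\mathrm{id}_{\sigma(a)} + \left(\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)} - \mathrm{id}_{\sigma(a)}\right) \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right)(a). \tag{C.607} \]

Using (B.322), we may then compute

\[ \begin{aligned}
\|a - b\| &\leq \left\|\left(\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)} - \mathrm{id}_{\sigma(a)}\right) \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right\|_\infty \\
&\leq \left\|\tfrac{1}{2}\varepsilon \cdot 1_{\sigma(a)}\right\|_\infty + \left\|\mathrm{id}_{\sigma(a)} \cdot 1_{[-\varepsilon/2,\varepsilon/2]}\right\|_\infty \\
&\leq \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon = \varepsilon.
\end{aligned} \tag{C.608} \]

Writing (C.607) as $b = f(a)$, the bounded Borel function $f$ on $\sigma(a)$ satisfies $f(x) = x$ if
$x \notin [-\varepsilon/2, \varepsilon/2]$ and $f(x) = \tfrac{1}{2}\varepsilon$ if $x \in [-\varepsilon/2, \varepsilon/2]$; either way, $f(x) \neq 0$, and in fact
$|f(x)| \geq \tfrac{1}{2}\varepsilon$, so that $1/f$ is bounded. Hence $f$ is invertible in the algebra of bounded Borel
functions on $\sigma(a)$, and therefore $b = f(a)$ is invertible in $W^*(a) \subseteq A$ and in $B(H)$. □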
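The construction in this proof can be sketched numerically. The following is only an illustration under a simplifying assumption that is not in the text: $a$ is taken to be a diagonal self-adjoint matrix, so that functional calculus acts on the eigenvalues and the operator norm is the largest absolute eigenvalue. The names `perturb_spectrum` and `sup_distance` are hypothetical helpers.

```python
def perturb_spectrum(eigenvalues, eps):
    """Apply f from the proof: f(x) = x outside [-eps/2, eps/2], f(x) = eps/2 inside."""
    half = eps / 2
    return [half if -half <= x <= half else x for x in eigenvalues]


def sup_distance(xs, ys):
    """Operator-norm distance between two diagonal matrices with these diagonals."""
    return max(abs(x - y) for x, y in zip(xs, ys))


# Assumed example: a = diag(eigs_a) is self-adjoint but not invertible (0 is an eigenvalue).
eigs_a = [-1.0, -0.01, 0.0, 0.02, 3.0]
eps = 0.1

# b = f(a) has no eigenvalue in (-eps/2, eps/2), hence is invertible, and stays within eps of a.
eigs_b = perturb_spectrum(eigs_a, eps)
```

In this toy case `eigs_b` has no entry of absolute value below `eps / 2`, and `sup_distance(eigs_a, eigs_b)` does not exceed `eps`, mirroring (C.608).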

We now turn to classes of C*-algebras that are sandwiched between the finite-dimensional
ones at the lower end and those with real rank zero at the upper end.

Definition C.166. Let $A$ be a unital C*-algebra. Then $A$ is said to be:

1. Finite-dimensional if it is finite-dimensional as a vector space. 

2. AF (Approximately Finite-dimensional) if it is the norm-closure of the union of 
some (not necessarily countable) directed set of finite-dimensional C*-subalgebras. 

3. Scattered if every $a \in A_{\mathrm{sa}}$ has countable spectrum.

4. A W*-algebra if it is the dual of a Banach space, and a von Neumann algebra
if it is a represented W*-algebra, i.e., $A \subset B(H)$; this can always be achieved.

5. Monotone complete if every upward directed bounded subset in $A_{\mathrm{sa}}$ (under the
usual order $\leq$) has a least upper bound (i.e. supremum).

6. AW* if for each nonempty subset $S \subset A$ there is $e \in \mathcal{P}(A)$ so that $R(S) = eA$.

7. Rickart if for each $a \in A$ there is a projection $e \in \mathcal{P}(A)$ so that $R(a) = eA$.

8. Real rank zero if its invertible self-adjoint elements are dense in $A_{\mathrm{sa}}$.

Here a subset $S$ of a poset $P$ is upward directed if for each $x, y \in S$ there is $z \in S$
such that $x \leq z$ and $y \leq z$ (for example, this is true in a complete lattice).
Furthermore, the right-annihilator $R(S)$ of $S \subset A$ is defined as

\[ R(S) = \{a \in A \mid ba = 0 \ \forall b \in S\}, \tag{C.609} \]

and $R(a) = R(\{a\})$; in the presence of an involution, equivalent definitions may be
given in terms of the left-annihilator. In all cases, the projection $e$ is unique. Since
Rickart himself already showed that $A$ is Rickart iff for each nonempty countable
subset $S \subset A$ there is $e \in \mathcal{P}(A)$ so that $R(S) = eA$, the difference between Rickart
and AW* lies in the countability assumption on $S$ in the former but not in the latter.

It is known that if a C*-algebra $A$ has a faithful representation on a separable
Hilbert space, then it is a Rickart C*-algebra iff it is an AW*-algebra, but otherwise
these classes are different. Similarly, an AW*-algebra is a W*-algebra iff it
has a separating family of normal states, where normality of functionals on AW*-algebras
is defined as in Definition 4.11, i.e. through complete additivity on orthogonal
families of projections, which always have an upper bound (cf. Theorem
C.169 below). This is the case in all examples relevant to mathematical physics, but
set-theoretically the class of AW*-algebras has higher cardinality than the class of
W*-algebras it contains. It is generally believed that a C*-algebra is Rickart iff it is
monotone σ-complete, and that it is AW* iff it is monotone complete, but there are
neither proofs of nor counterexamples to these claims. We have the inclusions:

W* ⊂ monotone complete ⊂ AW* ⊂ Rickart ⊂ real rank zero;

AF ⊂ real rank zero;
scattered ⊂ real rank zero.

Scattered C*-algebras may alternatively be characterized as those C*-algebras
on which every state is a w*-convergent convex sum of pure states; this condition is
far stronger than what the Krein-Milman theorem gives, namely that every state is
a w*-limit of some net consisting of finite convex sums of pure states. For example,
for any Hilbert space $H$ the compact operators $B_0(H)$ form a scattered C*-algebra
(extending the definition of the latter to the non-unital case as appropriate).

Two kinds of results are of interest for Bohrification: one is the topological
characterization of the commutative case of each class, the other is the characterization
of the class itself through properties of its commutative subalgebras. Without proof
we state what is known in this respect.

Theorem C.167. Let $A$ be a commutative unital C*-algebra. Then $A$ is:

1. Finite-dimensional iff $\Sigma(A)$ is finite (with discrete topology).

2. AF iff $\Sigma(A)$ is a Stone space.

3. Scattered iff $\Sigma(A)$ is scattered.

4. A W*-algebra or a von Neumann algebra iff $\Sigma(A)$ is hyperstonean.

5. Monotone complete iff $\Sigma(A)$ is stonean.

6. AW* iff $\Sigma(A)$ is stonean.

7. Rickart iff $\Sigma(A)$ is σ-stonean.

8. Real rank zero iff $\Sigma(A)$ is a Stone space.

Here we used the convention that a Stone space is a zero-dimensional compact
Hausdorff space (equivalently, it is compact Hausdorff and totally disconnected in
the sense that the only connected subsets are points). A (σ-)stonean space is a
Stone space with the additional property that $\mathrm{Clopen}(\Sigma(A))$ is a (σ-)complete lattice
(equivalently, a stonean space is a compact Hausdorff space that is extremally
disconnected in that the closure of each open set is open). Furthermore, a space $X$ is
hyperstonean if it is stonean, and for any nonzero $f \in C(X, \mathbb{R}^+)$ there exists a completely
additive positive measure $\mu$ such that $\mu(f) > 0$. In particular, in the commutative
case the classes AF and real rank zero coincide, as do AW* and monotone
complete algebras. A space $X$ is called scattered if each non-empty closed subset
$C \subset X$ contains an isolated point (i.e., a point $x \in C$ with an open neighbourhood $U$
such that $U \cap C = \{x\}$). If $X$ is scattered, then it is totally disconnected. An example
of a compact scattered space is $(1/\mathbb{N}) \cup \{0\}$ with the relative topology from $\mathbb{R}$.

This leads to the following generalization and extension of Theorem C.141. 

Theorem C.168. Let $A$ be a commutative unital C*-algebra. The projections $\mathcal{P}(A)$
in $A$ form a Boolean lattice, which is related to the Gelfand spectrum $\Sigma(A)$ through

\[ \mathcal{P}(A) \cong \mathrm{Clopen}(\Sigma(A)). \tag{C.610} \]

If $A$ is also AF, then its Gelfand spectrum $\Sigma(A)$ is a Stone space, and we have

\[ \Sigma(A) \cong \mathcal{S}(\mathcal{P}(A)); \tag{C.611} \]

\[ \mathcal{O}(\Sigma(A)) \cong \mathrm{Idl}(\mathcal{P}(A)); \tag{C.612} \]

\[ A \cong C(\mathcal{S}(\mathcal{P}(A))), \tag{C.613} \]

as topological spaces, frames, and (commutative) C*-algebras, respectively.

Conversely, for any Boolean lattice $L$ the C*-algebra $C(\mathcal{S}(L))$ is AF, and

\[ L \cong \mathcal{P}(C(\mathcal{S}(L))). \tag{C.614} \]

Proof. Using the Gelfand isomorphism $A \cong C(\Sigma(A))$, eq. (C.610) follows from

\[ \mathcal{P}(C(X)) \cong \mathrm{Clopen}(X), \tag{C.615} \]

where $X$ is some compact Hausdorff space. Indeed, if $e^2 = e^* = e \in C(X)$, then $e$
must be $\{0,1\}$-valued, so it must be $e = 1_U$ for some $U \subset X$, viz. $U = e^{-1}(\{1\})$.
Since $e \in C(X)$ is continuous, $U$ must be clopen. Conversely, for each $U \in \mathrm{Clopen}(X)$,
the function $1_U \in C(X)$ is a projection, and the maps $U \mapsto 1_U$ and $e \mapsto e^{-1}(\{1\})$ are
each other’s inverse. Theorem D.5 then implies that $\mathcal{P}(A)$ is Boolean.

If $A = C(X)$ is AF, then $C(X) = (\bigcup_\lambda A_\lambda)^-$ is the norm-closure of the union of
finite-dimensional C*-algebras $A_\lambda$, which union by the Stone-Weierstrass theorem
separates points of $X$. Since each $A_\lambda$ is the linear span of its projections, the
projections $\mathcal{P}(A_\lambda)$ of the finite-dimensional algebras $A_\lambda$ already separate points in $X$,
and this in turn implies that $X$ is totally separated, i.e., for each $x \neq y \in X$, there is
$U \in \mathrm{Clopen}(X)$ such that $x \in U$ and $y \notin U$. Since a compact Hausdorff space is
zero-dimensional (and hence Stone) iff it is totally separated, $X$ is a Stone space.

Again using Theorem C.8, we only need to prove (C.611) in the special case

\[ X \cong \mathcal{S}(\mathcal{P}(C(X))), \tag{C.616} \]

where $X$ is a Stone space; this follows from (C.615) and Theorem D.5. Eq. (C.612)
follows from (D.35), whilst (C.613) is immediate from (C.611) and Theorem C.8.

Finally, using Theorem D.5 we see that (C.614) reduces to (C.615), so we only
need to prove that $C(X)$ is AF for any Stone space $X$. This is just the above proof
of the converse run backwards: since $X$ is totally separated, for each $x \neq y$ we find
$U \in \mathrm{Clopen}(X)$ separating $x$ and $y$, so that also the associated projection $1_U$ separates
$x$ and $y$, and hence $\mathcal{P}(C(X))$ separates $X$. Taking $\Lambda$ to label the finite subsets of
$\mathcal{P}(C(X))$, and $A_\lambda$ to be the finite-dimensional C*-algebra generated by $\lambda \in \Lambda$, by
Stone-Weierstrass we have $C(X) = (\bigcup_\lambda A_\lambda)^-$. Hence $C(X)$ is AF. □

Theorem C.169. The claim that a unital C*-algebra lies in class SF iff each of its 
maximal abelian *-subalgebras lies in class SF is true for the following classes: 

1. Finite-dimensional C*-algebras. 

2. Scattered C*-algebras. 

3. von Neumann algebras. 

4. AW*-algebras. 

5. Rickart C*-algebras. 

The claim is false for AF-algebras, true for monotone complete C*-algebras iff these
coincide with AW*-algebras, false for real rank zero C*-algebras, and unknown for
W*-algebras, which we therefore state as a conjecture:

A C*-algebra is a W*-algebra iff each maximal abelian *-subalgebra is a W*-algebra. 

Proposition C.170. For any C*-algebra $A$ and any projections $e, f \in \mathcal{P}(A)$, we
have $ef = e$ iff $e \leq f$ (with partial ordering $\leq$ as defined in $A_{\mathrm{sa}}$ via $A^+$, cf. §C.7).

Proof. As explained above (C.93), if $a_1 \leq a_2$, then $b^* a_1 b \leq b^* a_2 b$, so $e \leq f$ implies

\[ (1_A - f)\, e\, (1_A - f) \leq (1_A - f)\, f\, (1_A - f) = 0. \tag{C.617} \]

However, since $e^2 = e^* = e$, with $c = e(1_A - f)$ we have $(1_A - f)\, e\, (1_A - f) = c^*c$,
and hence $(1_A - f)e = 0$ (as $c^*c \geq 0$, so that $c^*c = 0$ and therefore $c = 0$), or $e = fe$.
Taking adjoints gives $ef = e$, and consequently $ef = fe$. Conversely, if $ef = e$, we have
$ef = fe$ and hence

\[ (f - e)^2 = f - 2ef + e = f - e. \tag{C.618} \]

Of course, $f - e = (f - e)^*$, so that (C.618) makes $f - e$ a projection. Since any
projection lies in $A^+$, we have $f - e \geq 0$, and hence $e \leq f$. □
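In the simplest commutative situation the statement can be checked by brute force. The sketch below assumes a hypothetical three-point discrete space $X$, so that $C(X) = \mathbb{C}^3$, projections are the $\{0,1\}$-valued functions as in (C.615), and the order is pointwise; the helper names are illustrative only.

```python
from itertools import product

# Hypothetical finite model: X = {0, 1, 2} with the discrete topology,
# so C(X) is C^3 and every subset of X is clopen.
POINTS = range(3)

# Projections in C(X) are exactly the {0,1}-valued functions on X.
projections = list(product((0, 1), repeat=len(POINTS)))


def multiply(e, f):
    """Pointwise product of two functions on X."""
    return tuple(x * y for x, y in zip(e, f))


def leq(e, f):
    """e <= f in the C*-algebraic order: f - e is a positive function."""
    return all(y - x >= 0 for x, y in zip(e, f))


def support(e):
    """The clopen set U = e^{-1}({1}) attached to a projection e."""
    return frozenset(i for i, value in enumerate(e) if value == 1)


# For every pair: ef = e  iff  e <= f  iff  supp(e) is contained in supp(f).
agreement = all(
    (multiply(e, f) == e) == leq(e, f) == (support(e) <= support(f))
    for e in projections
    for f in projections
)
```

Here the eight projections correspond bijectively to the eight subsets of $X$, and `agreement` records that the algebraic, order-theoretic and set-theoretic conditions coincide in this model.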

The set of projections $\mathcal{P}(A)$ in a C*-algebra is always a poset in the order $\leq$, but
it is not automatically a lattice. It is a σ-complete lattice if $A$ is Rickart, and hence
also in all “lower” classes, including von Neumann algebras (cf. Proposition C.136),
where $\mathcal{P}(A)$ is even a complete lattice.

C.25 Jordan algebras and (pure) state spaces of C*-algebras

Let $A$ be a unital C*-algebra. As we know, the state space $S(A)$ is the set of all
states on $A$, seen as a compact convex set in the w*-topology inherited from the
embedding $S(A) \subset A^*$ (note that $S(A)$ fails to be compact if $A$ lacks a unit). To see
which information $S(A)$ carries about $A$, we need to impoverish $A$ as follows.

Definition C.171. A Jordan algebra is a real commutative (but generally non-associative)
algebra $A$ whose product $\circ$ satisfies (writing $a^2 = a \circ a$):

\[ a \circ (b \circ a^2) = (a \circ b) \circ a^2. \tag{C.619} \]

A JB-algebra is a Jordan algebra that is also a (real) Banach space such that:

\[ \|a \circ b\| \leq \|a\|\,\|b\|; \tag{C.620} \]

\[ \|a\|^2 \leq \|a^2 + b^2\|. \tag{C.621} \]

Given (C.620), axiom (C.621) is equivalent to $\|a^2\| \leq \|a^2 + b^2\|$ and $\|a^2\| = \|a\|^2$.
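A small numerical sketch may make the Jordan identity (C.619) concrete. It assumes the standard example of self-adjoint $2 \times 2$ matrices with the symmetrized product $a \circ b = \tfrac{1}{2}(ab + ba)$; the matrices and helper names below are hypothetical choices for illustration.

```python
def matmul(a, b):
    """Ordinary (associative) matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def jordan(a, b):
    """Symmetrized product (ab + ba) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(n) for j in range(n))


# Two self-adjoint matrices and two Pauli matrices (assumed test data).
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]

a2 = jordan(a, a)
jordan_identity_holds = close(jordan(a, jordan(b, a2)), jordan(jordan(a, b), a2))
product_self_adjoint = close(adjoint(matmul(a, b)), matmul(a, b))
jordan_self_adjoint = close(adjoint(jordan(a, b)), jordan(a, b))
associative_on_paulis = close(jordan(jordan(sx, sx), sy), jordan(sx, jordan(sx, sy)))
```

For this data the Jordan identity holds and $a \circ b$ is self-adjoint although $ab$ is not, while the Pauli matrices show that $\circ$ fails to be associative.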

It is easy to see that the self-adjoint part $A_{\mathrm{sa}}$ of any C*-algebra $A$ is a JB-algebra if
we put $a \circ b = \tfrac{1}{2}(ab + ba)$, cf. (5.14). If $A$ and $B$ are unital C*-algebras, we say that a
linear map $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ is a Jordan homomorphism if it preserves $\circ$; to this effect
it clearly suffices that $\varphi(a^2) = \varphi(a)^2$ for each $a$. If $\varphi$ in addition is bijective, then it
is called a Jordan isomorphism; in that case its inverse is necessarily linear and also
preserves the Jordan product $\circ$. A Jordan automorphism of a C*-algebra $A$ is
a Jordan isomorphism $A_{\mathrm{sa}} \to A_{\mathrm{sa}}$. Of course, we may complexify $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ so
as to obtain a $\mathbb{C}$-linear map $\varphi_{\mathbb{C}} : A \to B$ that equally well satisfies $\varphi_{\mathbb{C}}(a^2) = \varphi_{\mathbb{C}}(a)^2$,
this time for all $a \in A$ (rather than all $a \in A_{\mathrm{sa}}$). However, the conceptual point here
is that quantum-mechanical observables are supposed to be self-adjoint, and that
the Jordan product (but not the ordinary associative product) always preserves
self-adjointness. Generalizing Proposition 5.19, we then have the key result:

Theorem C.172. Let $A$ and $B$ be unital C*-algebras. There is a bijective correspondence
between Jordan isomorphisms $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ and affine homeomorphisms
$f : S(B) \to S(A)$, given by $f = \varphi^*$ (i.e. $f(\omega)(a) = \omega(\varphi(a))$). In particular, each
affine homeomorphism of $S(A)$ is induced by a Jordan automorphism of $A$.

The proof is similar to Proposition 5.19; generalizing Lemma 5.20 we now have:

Lemma C.173. Let $A$ and $B$ be unital C*-algebras. Then $f = \varphi^*$ gives a bijective
correspondence between affine bijections $f : S(B) \to S(A)$ and unital positive linear
bijections $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$. Moreover, if $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ is a unital linear bijection, then
$\varphi$ is positive iff $\varphi$ is isometric iff $\varphi$ is a Jordan isomorphism.

Most of the proof is practically the same as for Lemma 5.20 (so we omit it), except
for the last equivalence between invertible unital isometries and Jordan isomorphisms,
which is deeper and relies on Kadison’s inequality $\varphi(a^*a) \geq \varphi(a)^* \varphi(a)$ for
positive unital linear maps $\varphi$ between C*-algebras and normal operators $a$.

A similar result is Hamhalter’s generalization of Dye’s Theorem to AW*-algebras:

Theorem C.174. Let $A$ and $B$ be AW*-algebras and let $N : \mathcal{P}(A) \to \mathcal{P}(B)$ be an
isomorphism of the corresponding orthocomplemented projection lattices that in
addition preserves arbitrary suprema. If $A$ has no summand isomorphic to either
$\mathbb{C}^2$ or $M_2(\mathbb{C})$, then there is a unique Jordan isomorphism $J : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ that extends $N$
(and hence Jordan isomorphisms are characterized by their values on projections).

This generalizes Corollary 5.22 in the main text, but has a much more difficult proof.

Proof. If $e, f \in \mathcal{P}(A)$ are orthogonal, then so are $N(e)$ and $N(f)$, so that

\[ N(e + f) = N(e) + N(f). \tag{C.622} \]

Gleason’s Theorem for AW*-algebras then gives a Jordan homomorphism

\[ J_{(e,f)} : AW^*(e,f)_{\mathrm{sa}} \to B_{\mathrm{sa}}, \tag{C.623} \]

where $AW^*(e,f)$ is the AW*-algebra generated by $e$, $f$, and the unit $1_A$, which in
particular preserves all Jordan triple products

\[ \{a,b,c\} = (a \circ b) \circ c + a \circ (b \circ c) - b \circ (a \circ c), \tag{C.624} \]

which in terms of the usual operator product equals $\tfrac{1}{2}(abc + cba)$. This implies

\[ N((1_A - 2e) f (1_A - 2e)) = (1_B - 2N(e))\, N(f)\, (1_B - 2N(e)), \tag{C.625} \]

which (in the second major step of the proof, after the application of Gleason’s
Theorem) is necessary and sufficient for $N$ to extend to a Jordan isomorphism. □

The structure of Jordan isomorphisms may be inferred from the following remarkable
result, in which a linear map $\varphi : A \to B$ between C*-algebras is called an
anti-homomorphism if $\varphi(a^*) = \varphi(a)^*$ as usual, but $\varphi(ab) = \varphi(b)\varphi(a)$.

Theorem C.175. If $\varphi : A_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$ is a Jordan homomorphism (where $A$ is a
C*-algebra and $H$ is a Hilbert space), there exist three mutually orthogonal projections
$e_1$, $e_2$, $e_3$ in the center $\varphi(A)' \cap \varphi(A)''$ of the von Neumann algebra $\varphi(A)''$, such that:

1. $e_1 + e_2 + e_3 = 1_H$;

2. The map $a \mapsto \varphi_{\mathbb{C}}(a)e_1$ from $A$ to $B(e_1 H)$ is a homomorphism (of C*-algebras).

3. The map $a \mapsto \varphi_{\mathbb{C}}(a)e_2$ from $A$ to $B(e_2 H)$ is an anti-homomorphism (ibid.).

4. The map $a \mapsto \varphi_{\mathbb{C}}(a)e_3$ from $A$ to $B(e_3 H)$ is both a homomorphism and an
anti-homomorphism of C*-algebras (so that $\varphi_{\mathbb{C}}(A)e_3$ is commutative).

If in addition $a \mapsto \varphi_{\mathbb{C}}(a)e_1$ is not an anti-homomorphism and $a \mapsto \varphi_{\mathbb{C}}(a)e_2$ is not a
homomorphism, then $e_1$, $e_2$, and $e_3$ are uniquely determined by these conditions.

Like the previous theorem, the proof of this one exceeds the scope of this book.

Corollary C.176. Let $J : B(H)_{\mathrm{sa}} \to B(H)_{\mathrm{sa}}$ be a Jordan isomorphism. Then $J_{\mathbb{C}} :
B(H) \to B(H)$ is either a homomorphism or an anti-homomorphism of C*-algebras.

Proof. The center of $B(H)$ is trivial, so either $e_1 = 1_H$ or $e_2 = 1_H$. □

The pure state space $P(A) = \partial_e S(A)$ is the extreme boundary of the state space $S(A)$.
According to the Krein-Milman Theorem B.50, $P(A)$ is not empty, and

\[ S(A) = (\mathrm{co}\, P(A))^-; \tag{C.626} \]

see (B.165) for notation. In order to recover $S(A)$ from $P(A)$, the latter obviously
needs more structure than just that of a set. First, it inherits the w*-topology from
$A^*$, but it turns out that we need to equip $P(A)$ with the more refined w*-uniformity.

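In general, a uniform structure on a set $X$ (also called an entourage uniformity)
is a nonempty filter $\mathcal{U}$ on $X \times X$ (i.e., a collection $\mathcal{U}$ of subsets of
$X \times X$ such that $U \in \mathcal{U}$ and $U \subseteq V$ imply $V \in \mathcal{U}$, and $U \in \mathcal{U}$ and $V \in \mathcal{U}$ imply
$U \cap V \in \mathcal{U}$) satisfying the following conditions:

1. Each $U \in \mathcal{U}$ contains the diagonal $\Delta_X = \{(x,x) \mid x \in X\}$;

2. If $U \in \mathcal{U}$, then $U^{-1} \in \mathcal{U}$, where $U^{-1} = \{(y,x) \mid (x,y) \in U\}$;

3. If $U \in \mathcal{U}$, then there is some $V \in \mathcal{U}$ such that $V^2 \subseteq U$, where

\[ V^2 = \{(x,z) \mid \exists y \in X : (x,y) \in V, (y,z) \in V\}. \tag{C.627} \]

A set with a uniformity is called a uniform space. If $(X, \mathcal{U}_X)$ and $(Y, \mathcal{U}_Y)$ are uniform spaces, a
function $f : X \to Y$ is uniformly continuous if $(f \times f)^{-1}(V) \in \mathcal{U}_X$ whenever $V \in \mathcal{U}_Y$.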
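The three conditions can be illustrated on the basic entourages of a metric uniformity. The sketch below is only an assumed toy model (a four-point sample of the real line with sets $U_\varepsilon = \{(x,y) : |x - y| < \varepsilon\}$) and checks the conditions for these basic sets rather than for a full filter; all names are hypothetical.

```python
from itertools import product

# Hypothetical finite sample of the real line, used only for illustration.
POINTS = [0.0, 0.25, 0.5, 1.0]


def entourage(eps):
    """Basic entourage U_eps = {(x, y) : |x - y| < eps} of the metric uniformity."""
    return {(x, y) for x, y in product(POINTS, repeat=2) if abs(x - y) < eps}


def inverse(u):
    """The set {(y, x) : (x, y) in u} from condition 2."""
    return {(y, x) for (x, y) in u}


def compose(v):
    """V^2 = {(x, z) : there is y with (x, y) in V and (y, z) in V}, as in (C.627)."""
    return {(x, z) for (x, y1) in v for (y2, z) in v if y1 == y2}


diagonal = {(x, x) for x in POINTS}
checks = []
for eps in (0.3, 0.6, 1.5):
    u = entourage(eps)
    checks.append(diagonal <= u)                      # condition 1
    checks.append(inverse(u) == u)                    # condition 2 (U_eps is symmetric)
    checks.append(compose(entourage(eps / 2)) <= u)   # condition 3 with V = U_{eps/2}
```

Condition 3 holds here with $V = U_{\varepsilon/2}$ because of the triangle inequality.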

The w*-uniformity on $A^*$, where $A$ is any Banach space, is the smallest one
containing all subsets of the type

\[ \{(\varphi,\varphi') \in A^* \times A^* : |\varphi(a) - \varphi'(a)| < \varepsilon\}, \tag{C.628} \]

where $a \in A$ and $\varepsilon > 0$; this implies that $U \in \mathcal{U}$ iff $U$ contains some finite
intersection of such subsets.

Second, $P(A)$ carries a natural transition probability, cf. Definition 1.17 and
(2.43). For $\omega, \omega' \in P(A)$, this function $\tau : P(A) \times P(A) \to [0,1]$ is defined by

\[ \tau(\omega, \omega') = \inf\{\omega(a) \mid a \in A, 0 \leq a \leq 1_A, \omega'(a) = 1\}. \tag{C.629} \]

This definition, and the following result, are valid even if $A$ has no unit.

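Proposition C.177. Let $A$ be a C*-algebra and define $\tau$ by (C.629). Then

\[ \tau(\omega,\omega') = 1 - \tfrac{1}{4}\|\omega - \omega'\|^2, \tag{C.630} \]

and the following dichotomy applies:

1. If $\omega$ and $\omega'$ are equivalent (in the sense that the corresponding GNS-representations
$\pi_\omega$ and $\pi_{\omega'}$ are unitarily equivalent), so that we may assume that the associated
cyclic vectors $\Omega_\omega$ and $\Omega_{\omega'}$ lie in the same Hilbert space, we have

\[ \tau(\omega,\omega') = \mathrm{Tr}(e_{\Omega_\omega} e_{\Omega_{\omega'}}) = |\langle \Omega_\omega, \Omega_{\omega'} \rangle|^2. \tag{C.631} \]

2. If $\omega$ and $\omega'$ are inequivalent (in that $\pi_\omega$ and $\pi_{\omega'}$ are inequivalent), then

\[ \tau(\omega,\omega') = 0. \tag{C.632} \]

Proof. We first show that (C.630) yields (C.631) and (C.632). In the first case,

\[ \begin{aligned}
\|\omega - \omega'\| &= \sup\{|\omega(a) - \omega'(a)|,\ a \in A,\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})\pi_\omega(a))|,\ a \in A,\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in \pi_\omega(A),\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in B(H_\omega),\ \|a\| = 1\} \\
&= \|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1,
\end{aligned} \tag{C.633} \]

where $\|\cdot\|_1$ is the trace norm on $B_1(H_\omega)$. In passing from $\pi_\omega(A)$ to $B(H_\omega)$ we used the
fact that the map $a \mapsto \mathrm{Tr}(ba)$ is σ-weakly continuous for any $b \in B_1(H_\omega)$, so that we may replace
the supremum over $a \in \pi_\omega(A)$ by the supremum over $a$ in the σ-weak closure of
$\pi_\omega(A)$, which by Theorem C.130 is $\pi_\omega(A)''$, which in turn is $B(H_\omega)$ because
$\pi_\omega(A)$ is irreducible (since $\omega$ is pure, cf. Theorem C.90). The last step then follows
from Theorem B.146. To compute the last expression in (C.633), we assume that
$\Omega_\omega$ and $\Omega_{\omega'}$ are not proportional (if they are, then $\omega = \omega'$, so that (C.630) reduces
to $1 = 1$, and hence holds). We may then work in the 2-dimensional Hilbert space
spanned by $\Omega_\omega = (1,0)$ and $\Omega_{\omega'} = (c_1, c_2)$, with $|c_1|^2 + |c_2|^2 = 1$. In that case,

\[ (e_{\Omega_\omega} - e_{\Omega_{\omega'}})^2 = |c_2|^2 \cdot 1_2; \tag{C.634} \]

\[ |e_{\Omega_\omega} - e_{\Omega_{\omega'}}| = \sqrt{(e_{\Omega_\omega} - e_{\Omega_{\omega'}})^2} = |c_2| \cdot 1_2; \tag{C.635} \]

\[ \|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1 = \mathrm{Tr}(|e_{\Omega_\omega} - e_{\Omega_{\omega'}}|) = 2|c_2|. \tag{C.636} \]

Using (C.633), this gives

\[ 1 - \tfrac{1}{4}\|\omega - \omega'\|^2 = 1 - \tfrac{1}{4}\|e_{\Omega_\omega} - e_{\Omega_{\omega'}}\|_1^2 = 1 - |c_2|^2 = |c_1|^2 = |\langle \Omega_\omega, \Omega_{\omega'} \rangle|^2. \tag{C.637} \]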
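The two-dimensional computation (C.634) to (C.637) is easy to verify numerically. The sketch below assumes the sample values $c_1 = 0.6$ and $c_2 = 0.8i$ (any unit vector would do) and uses the closed formula for the eigenvalues of a Hermitian $2 \times 2$ matrix; the function names are hypothetical.

```python
import math


def projection(v):
    """Rank-one projection onto the unit vector v in C^2 (entries v_i * conj(v_j))."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def trace_norm(d):
    """Trace norm of a Hermitian 2x2 matrix: the sum of the absolute eigenvalues."""
    tr = (d[0][0] + d[1][1]).real
    det = (d[0][0] * d[1][1] - d[0][1] * d[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return abs((tr + disc) / 2) + abs((tr - disc) / 2)


c1, c2 = complex(0.6, 0.0), complex(0.0, 0.8)   # assumed sample with |c1|^2 + |c2|^2 = 1
omega = [complex(1.0), complex(0.0)]            # the vector (1, 0)
omega_prime = [c1, c2]                          # the vector (c1, c2)

e, e_prime = projection(omega), projection(omega_prime)
d = [[e[i][j] - e_prime[i][j] for j in range(2)] for i in range(2)]
d_squared = matmul(d, d)                        # should equal |c2|^2 times the identity

norm_1 = trace_norm(d)                          # should equal 2|c2|
tau_from_norm = 1 - norm_1 ** 2 / 4             # right-hand side of (C.630)
tau_from_inner = abs(sum(x.conjugate() * y for x, y in zip(omega, omega_prime))) ** 2
```

Both `tau_from_norm` and `tau_from_inner` come out as $|c_1|^2$ up to rounding, in line with (C.637).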

To deal with the second case, we use the following version of Schur’s Lemma:

Lemma C.178. Let $\pi_\omega$ and $\pi_{\omega'}$ be irreducible representations of some C*-algebra
$A$, and let $w : H_\omega \to H_{\omega'}$ be an intertwiner, i.e., a bounded linear map that satisfies

\[ w\pi_\omega(a) = \pi_{\omega'}(a)w \quad (a \in A). \tag{C.638} \]

• If $\pi_\omega$ and $\pi_{\omega'}$ are equivalent, then $w$ is either zero or invertible.

• If $\pi_\omega$ and $\pi_{\omega'}$ are inequivalent, then $w$ is zero.

Proof. The proof is the same as for group representations: taking the adjoint of
(C.638), it follows that $w^*w \in \pi_\omega(A)'$ and $ww^* \in \pi_{\omega'}(A)'$, so by Theorem C.90 (i.e.
the mother of all Schur’s lemmas) we have $w^*w = \lambda \cdot 1_{H_\omega}$ and $ww^* = \mu \cdot 1_{H_{\omega'}}$,
for some $\lambda, \mu \in \mathbb{R}^+$ (since $w^*w$ and $ww^*$ are positive operators). Moreover, since
$\lambda w = ww^*w = \mu w$, in fact we have $\lambda = \mu$ whenever $w \neq 0$. If $\lambda > 0$, then the
operator $\lambda^{-1/2} w : H_\omega \to H_{\omega'}$ is a unitary intertwiner, so $\pi_\omega$ and $\pi_{\omega'}$ are equivalent.

If $\lambda = 0$, then $w^*w = 0$ and hence $w = 0$, since $\|w^*w\| = \|w\|^2$. □



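Continuing the proof of (C.632), we form the direct sum

\[ \pi(A) = \pi_\omega(A) \oplus \pi_{\omega'}(A); \tag{C.639} \]

\[ H = H_\omega \oplus H_{\omega'}. \tag{C.640} \]

The second case of Lemma C.178 then gives

\[ (\pi_\omega(A) \oplus \pi_{\omega'}(A))' = \pi_\omega(A)' \oplus \pi_{\omega'}(A)', \tag{C.641} \]

whose right-hand side consists of operators $\lambda \cdot 1_{H_\omega} \oplus \mu \cdot 1_{H_{\omega'}}$ ($\lambda, \mu \in \mathbb{C}$), so that

\[ (\pi_\omega(A) \oplus \pi_{\omega'}(A))'' = \pi_\omega(A)'' \oplus \pi_{\omega'}(A)'' = B(H_\omega) \oplus B(H_{\omega'}). \tag{C.642} \]

Once again using Theorem C.130, a computation à la (C.633) therefore gives

\[ \begin{aligned}
\|\omega - \omega'\| &= \sup\{|\mathrm{Tr}((e_{\Omega_\omega} - e_{\Omega_{\omega'}})a)|,\ a \in B(H_\omega) \oplus B(H_{\omega'}),\ \|a\| = 1\} \\
&= \sup\{|\mathrm{Tr}(e_{\Omega_\omega}a) - \mathrm{Tr}(e_{\Omega_{\omega'}}a')|,\ a \in B(H_\omega),\ a' \in B(H_{\omega'}),\ \|a \oplus a'\| = 1\} \\
&= \sup\{|\mathrm{Tr}(e_{\Omega_\omega}a)|,\ a \in B(H_\omega),\ \|a\| = 1\} \\
&\quad + \sup\{|\mathrm{Tr}(e_{\Omega_{\omega'}}a')|,\ a' \in B(H_{\omega'}),\ \|a'\| = 1\} \\
&= \|e_{\Omega_\omega}\|_1 + \|e_{\Omega_{\omega'}}\|_1 = 1 + 1 = 2,
\end{aligned} \tag{C.643} \]

since the trace may be computed in a basis of $H_\omega \oplus H_{\omega'}$ consisting of a basis of $H_\omega$
and a basis of $H_{\omega'}$, and $\|a \oplus a'\| = \max\{\|a\|, \|a'\|\}$ for $a \in B(H_\omega)$ and $a' \in B(H_{\omega'})$.
Finally, we prove that (C.629) and (C.630) coincide. If $\omega$ and $\omega'$ are equivalent,

\[ \tau(\omega,\omega') = \inf\{\mathrm{Tr}(e_{\Omega_\omega}\pi_\omega(a)) \mid a \in A,\ 0 \leq a \leq 1_A,\ \mathrm{Tr}(e_{\Omega_{\omega'}}\pi_\omega(a)) = 1\} \tag{C.644} \]

and, as in (C.633), Theorem C.130 allows us to replace the infimum over $a \in A$ by
the one over $a \in B(H_\omega)$. The claim then follows from Theorem 2.12 and eq. (C.631).
Similarly, if $\omega$ and $\omega'$ are inequivalent, eq. (C.642) and Theorem C.130 give

\[ \tau(\omega,\omega') = \inf\{\mathrm{Tr}(e_{\Omega_\omega}a) \mid a \in B(H_\omega) \oplus B(H_{\omega'}),\ 0 \leq a \leq 1_H,\ \mathrm{Tr}(e_{\Omega_{\omega'}}a) = 1\}, \]

and notice that the infimum zero is reached by $a = 0 \cdot 1_{H_\omega} \oplus 1_{H_{\omega'}}$. □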

The final result of this appendix, then, is the “pure” counterpart of Theorem C.172: 

Theorem C.179. Let $A$ and $B$ be unital C*-algebras. There is a bijective correspondence
$f = \varphi^*$ between Jordan isomorphisms $\varphi : A_{\mathrm{sa}} \to B_{\mathrm{sa}}$ and bijections
$f : P(B) \to P(A)$ that preserve transition probabilities and are w*-uniformly continuous
along with their inverse. In particular, $\varphi : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ is a Jordan automorphism
of $A$ iff $\varphi^* : P(A) \to P(A)$ has the properties just stated for $f$.

The proof of this theorem is far more difficult than that of Theorem C.172, so we omit it.

If $A = C(X)$ and $B = C(Y)$ are commutative, we obtain a variation on Corollary
C.22 featuring uniform homeomorphisms. Also, we see from Wigner’s Theorem
5.4.1 that for $A = B(H)$ it is enough to consider normal pure states, in which case
also the (uniform) continuity condition on $f$ is superfluous.

Notes

As already mentioned in the Introduction, the theory of operator algebras on Hilbert
spaces was created by von Neumann, partly in collaboration with his assistant Murray
(von Neumann, 1930, 1931, 1938, 1940, 1949; Murray & von Neumann, 1936,
1937, 1943, reprinted in von Neumann, 1961). His motivation for doing so certainly
included quantum mechanics, but also functional analysis, measure theory, ergodic
theory, and representation theory, all of which fields in turn benefited from their
interaction with operator algebras. Von Neumann (and Murray) studied what they
called “rings of operators”, which are now deservedly called von Neumann algebras.
John von Neumann (1903-1957) was one of the greatest mathematicians in
history, especially considering the totality of his oeuvre in pure and applied mathematics
(including numerical mathematics, computer science, and mathematical economics).
His work in mathematical physics, notably on the mathematical structure
of quantum mechanics, in some sense forms a bridge between the two.

Von Neumann was a Hungarian prodigy; he wrote his first mathematical paper at
the age of seventeen. Except for this first paper, his early work was in set theory and
the foundations of mathematics. In the Fall of 1926, he moved to Göttingen to work
with Hilbert. Around 1920, Hilbert had initiated his Beweistheorie, an approach to
the foundations of mathematics whose specific technical goals were not achieved
because of Gödel’s work, but whose overall view of mathematics (i.e. as an activity
whose correctness is to be established purely syntactically and whose meaning is a
semantic matter to be distinguished from its syntax) still reigns. However, at the time
that von Neumann arrived, Hilbert was also interested in quantum mechanics. Apart
from his broad interest in general (mathematical) physics (for example, his Sixth
Problem from 1900 called for the mathematical axiomatization of physics), Hilbert
was specifically attracted to quantum mechanics because Göttingen was, next to
Copenhagen, a leading center for research in this area. Indeed, Heisenberg’s (1925)
paper initiating quantum mechanics (at least in its preliminary guise of “matrix mechanics”)
was followed by the Dreimännerarbeit of Born, Heisenberg, and Jordan
(1926), and all three were in Göttingen at the time. Born was one of the few physicists
of his day to be familiar with the concept of a matrix; in previous research he
had even used infinite matrices. Born turned to his former teacher Hilbert for mathematical
advice. Aided by his assistants Nordheim and von Neumann, Hilbert thus
ran a seminar on the mathematical structure of quantum mechanics, and the three
wrote a joint paper on the subject (which is now exclusively of historical value).

It was von Neumann (1927ab) who, at the age of 23, discovered the mathematical
structure of quantum mechanics. In this process, he defined the abstract concept of
a Hilbert space, which previously had only appeared in examples that went back to
the work of Hilbert and his pupils on integral equations, spectral theory, and
infinite-dimensional quadratic forms. Hilbert’s famous memoirs on integral equations had
appeared between 1904 and 1906; in 1908, his student Schmidt had defined the
space $\ell^2$ in the modern sense, and F. Riesz had studied the space of all continuous
linear maps on $\ell^2$ in 1912. Various examples of $L^2$-spaces had emerged around the
same time (with hindsight, Hilbert himself mainly worked with the unit ball of $\ell^2$).

However, the abstract notion of a Hilbert space was missing until von Neumann
provided it. In particular, von Neumann saw that Schrödinger’s wave functions were
unit vectors in a Hilbert space of $L^2$ type, and that Heisenberg’s observables were
linear operators on a different Hilbert space, of $\ell^2$ type. A unitary transformation
between these spaces provided the mathematical equivalence between wave mechanics
and matrix mechanics. Moreover, von Neumann developed the spectral theory
of bounded as well as unbounded normal operators on a Hilbert space. This work
culminated in his book Mathematische Grundlagen der Quantenmechanik (1932).

Despite the tremendous prestige of von Neumann, initially few mathematicians
recognized the importance of his subsequent theory of operator algebras. For example,
after a lecture by von Neumann on operator algebras in the weekly mathematics
colloquium at Harvard sometime in the 1930s, G. H. Hardy, one of the leading
mathematicians of his time, is reported to have said:*

“He is quite clearly a brilliant man, but why does he waste his time on this stuff?” 

Fortunately, among those who did study operator algebras were Gelfand & Naimark 
(1943), who linked the subject to Gelfand’s earlier work on (commutative) Banach 
algebras and in doing so created the theory of C*-algebras. This, in turn, was picked 
up by Segal (1947ab), who thereby also restored the link with quantum theory. 

A survey of von Neumann’s mathematical work is given in Oxtoby et al (1958),
which contains a biographical introduction by von Neumann’s friend and colleague
Ulam, and some of von Neumann’s correspondence is collected in Redei (2005b),
which also contains a short mathematical biography. One of the most insightful
documents about von Neumann is the rare manuscript Vonneumann (1987) by his
brother Nicholas, of which the author got a copy from von Neumann’s only PhD student
Israel Halperin, who visited Cambridge on a peace mission in the early 1990s.^
Politically, von Neumann was a controversial figure because of his enthusiastic contributions
to nuclear weapons and the arms race between the USA and the Soviet
Union; see Heims (1980) and Macrae (1992) for different perspectives on this. A
substantial scholarly scientific biography of von Neumann remains to be written.

The history of operator algebras (i.e. von Neumann algebras and C*-algebras,
which terms were probably introduced by Dieudonné and Segal, respectively) has
been described in Kadison (1982), Doran & Belfi (1986), and Doran (1994).

Leading textbooks on operator algebras, written by some of the original contributors,
are Neumark (1968), Sakai (1971), Dixmier (1977, 1981), Pedersen (1979),
Kadison & Ringrose (1983, 1986), and Takesaki (2002, 2003a, 2003b). See also
Murphy (1990), Li (1992), Davidson (1996), Blackadar (2006), and the remarkable
lectures on von Neumann algebras by algebraic topologist Lurie (2011). Connes
(1994), written by arguably the greatest contemporary mathematician working in
operator algebras, also provides innumerable fascinating insights into the subject.

* Reported by G.D. Birkhoff (who overheard Hardy saying this) to his son, Garrett Birkhoff, who 
in turn mentioned it to G.C. Rota, who wrote it down in the Introduction to Stern (1991). 

^ According to Rhodes (1996, pp. 245-246), Halperin was a spy for the Soviet Union, although his
evidence seems limited to the fact that Halperin was arrested in 1946 suspected of espionage, having
Klaus Fuchs in his address book.

§C.1. Basic definitions and examples

As in the notes to the previous appendix, we only comment on results whose
origins are less well known or which are less standard by themselves, the rest belonging
to the foundations of the field as described in the textbooks just mentioned.
Once again, for this reason not all sections in this appendix come with notes.

§C.2. Gelfand isomorphism

The implication $\omega \in \Sigma(A) \Rightarrow \omega(a) \in \sigma(a)$ ($a \in A$) in the proof of Lemma C.9
also holds in the opposite direction (given that $A$ is a Banach algebra with unit and
$\omega : A \to \mathbb{C}$ is linear); this is the Gleason-Kahane-Zelazko Theorem (Sourour, 1994).
A recent monograph about $C(X)$ is Groenewegen & van Rooij (2016), following up
on earlier books like Semadeni (1971) and Gillman & Jerison (1976).

§C.3. Gelfand duality

Proposition C.19 is due to Gelfand & Kolmogorov (1939). In the spirit of the
proof of the Stone-Weierstrass Theorem B.51 in §B.10, let us give an alternative
proof of this proposition (Simon, 2011), which is based on Proposition C.14 and
Corollary B.17. These identify $\Sigma(C(X))$ with the set $\partial_e M_1^+(X)$ of extreme completely
regular probability measures on $X$, provided we identify the latter with the
corresponding functionals on $C(X)$, as in (B.39). That is, we must prove that the
map $x \mapsto \delta_x$ (i.e., the Dirac measure at $x$, which, seen as a functional on $C(X)$, is just
the evaluation map $\mathrm{ev}_x$) is a bijection.

Proof. We first show that a measure $\mu \in \partial_e M_1^+(X)$ must satisfy $\mu(A) = 1$ or
$\mu(A) = 0$ for any measurable $A \subseteq X$. For if there is some measurable $C \subseteq X$ for which
$0 < \mu(C) < 1$, we have a nontrivial convex decomposition $\mu = t\mu_1 + (1 - t)\mu_2$, namely $t = \mu(C)$,
$\mu_1(A) = \mu(A|C)$ (i.e., $\mu(A \cap C)/\mu(C)$), and $\mu_2(A) = \mu(A \setminus C)/\mu(X \setminus C)$. From this,
we show that $\mathrm{supp}(\mu)$ is a point. Indeed, if both $x$ and $y \neq x$ would lie in $\mathrm{supp}(\mu)$,
we could separate these with disjoint open sets $x \in U$ and $y \in V$. This would leave
four (im)possibilities:

• $\mu(U) = \mu(V) = 1$ would imply $\mu(X) \geq 2$, contradicting $\mu(X) = 1$;

• $\mu(U) = 0$ would make $U^c \cap \mathrm{supp}(\mu)$ a proper closed subset of $\mathrm{supp}(\mu)$ whose
open complement has measure zero, contradicting the definition of $\mathrm{supp}(\mu)$; this
covers both $\mu(V) = 0$ and $\mu(V) = 1$, and the same argument with $U$ and $V$
interchanged covers the remaining case $\mu(U) = 1$, $\mu(V) = 0$.

Thus $\mathrm{supp}(\mu) = \{x\}$ for some $x \in X$, i.e., $\mu = \delta_x$, so that $\partial_e M_1^+(X) \subseteq X$. Finally, we
also have $X \subseteq \partial_e M_1^+(X)$, since $\delta_x = t\mu_1 + (1 - t)\mu_2$ forces

\[ \mathrm{supp}(\mu_1) = \mathrm{supp}(\mu_2) = \{x\}, \tag{C.645} \]

and hence $\mu_1 = \mu_2 = \delta_x$. □
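The convex decomposition used at the start of this proof can be illustrated on a finite set. The sketch below assumes a hypothetical two-point space with masses $1/3$ and $2/3$ and uses exact rational arithmetic; `conditional` is an illustrative helper, not notation from the text.

```python
from fractions import Fraction


def conditional(mu, C):
    """The measure mu(. | C) on a finite set; mu maps points to their masses."""
    mass = sum(mu[p] for p in C)
    return {p: (mu[p] / mass if p in C else Fraction(0)) for p in mu}


# Assumed example: a probability measure on {x, y} that is not a point mass.
mu = {"x": Fraction(1, 3), "y": Fraction(2, 3)}
C = {"x"}                                   # a set with 0 < mu(C) < 1

t = sum(mu[p] for p in C)                   # t = mu(C)
mu_1 = conditional(mu, C)                   # mu_1(A) = mu(A ∩ C) / mu(C)
mu_2 = conditional(mu, set(mu) - C)         # mu_2(A) = mu(A \ C) / mu(X \ C)

# mu = t * mu_1 + (1 - t) * mu_2 with mu_1 != mu_2, so mu is not extreme.
recombined = {p: t * mu_1[p] + (1 - t) * mu_2[p] for p in mu}
```

In this example `mu_1` and `mu_2` are the two Dirac measures, and `recombined` reproduces `mu`, so a measure with $0 < \mu(C) < 1$ is a proper convex combination.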

In the unital/compact case, categorical Gelfand duality was first established in 
Negrepontis (1969, 1971), and was reproved in a different way by Johnstone (1982). 
Our proof is taken from Landsman (2004), with some improvements in the 
non-unital case due to Brandenburg (2015), but it should be considered “folklore”. 

In the smooth case, Corollary C.22 is often called Milnor’s exercise. The result 
even holds without the second countability assumption on the manifold $X$, but with 
a completely different proof (Mrčun, 2005). See also Burtscher (2009). 

§C.6. C*-algebras without unit: commutative case 

For proper maps see e.g. Bourbaki (1989), §1.10. 

§C.10. Hilbert C*-modules and multiplier algebras 

The theory of Hilbert C*-modules goes back to Kaplansky, Paschke, and Rieffel. 
See Lance (1995) and Raeburn & Williams (1998) for textbook coverage, and Landsman 
(1998a) for applications to mathematical physics (e.g. constrained quantization). 
Theorem C.76 is due to An Huef, Raeburn, & Williams (2010). 

The Cohen-Hewitt Factorization Theorem à la Fell & Doran (1988), Theorem 
V.9.2, adapted to C*-algebras, states that if $A$ and $B$ are C*-algebras and $\alpha : A \to M(B)$ 
is a homomorphism, then $\{\alpha(a)b \mid a \in A, b \in B\}$ is a closed linear subspace of $B$. 
Consequently, if $\alpha$ is nondegenerate, then each element $c \in B$ factors as $c = \alpha(a)b$. 
In particular, taking $B = A$ and $\alpha$ to be the identity, we see that Lemma C.47 may 
be sharpened to the claim that any $c \in A$ takes the form $c = ab$ for suitable $a, b \in A$. 

§C.11. Gelfand topology as a frame 

Our treatment of frames and locales has been borrowed from Mac Lane & Moerdijk 
(1992), where the details of the proof of Theorem C.80 may also be found. 
See also Picado & Pultr (2012). Hereditary subalgebras are discussed e.g. in Pedersen 
(1979) and Blackadar (2006). 

The fact that $H(A)$ forms a complete lattice was noted by Akemann & Bice 
(2014), who also pursued the analogy with open sets, though not in a frame-theoretic 
setting. The theory is still disappointing in various ways, most notably in the fact 
that $H(A)$ fails to be a frame unless $A$ is commutative. Also, Theorem C.86 has (so 
far) been proved by conventional means, i.e., via the Gelfand isomorphism; it would 
be preferable to prove it purely algebraically (and if possible constructively). 

From a localic point of view, the Gelfand transform $\hat{a} : \Sigma(A) \to \mathbb{C}$ of $a \in A$ should 
primarily be described as the corresponding frame map $\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\Sigma(A))$, and 
hence, using Corollary C.84, as a frame map 

$$\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to H(A). \tag{C.646}$$ 

Denoting the hereditary subalgebra generated by $a$ by $H_a$, i.e., the closure of $aA$, 
for $U \in \mathcal{O}(\mathbb{C})$ we obtain a nice formula whose use remains to be established: 

$$\hat{a}^{-1}(U) = \bigcap_{z \in \mathbb{C} \setminus U} H_{a-z}. \tag{C.647}$$ 

A direct proof of the last claim of Proposition C.82 uses the property $H(A) = I(A)$ 
(in the commutative case), the identification of $I \wedge J$ with $(IJ)^-$ (i.e., the closure of 
the linear span of all $ab$, $a \in I$, $b \in J$, which follows by taking an approximate unit 
in $I$ or $J$), and the identification of $\bigvee S$ with the closure of the linear span of $\bigcup S$. 

§C.13. Tensor products of Hilbert spaces and C*-algebras 

For the proof of (C.248) see Reed & Simon (1972), Theorem II.10. 

For tensor products of C*-algebras we mainly relied on Lance (1982), Li (1992), 
Wegge-Olsen (1993), and Takesaki (2002), by one of the founders of the theory. 
Tensor products of Banach spaces and Hilbert spaces were first studied by Schatten 
(1946) and Schatten & von Neumann (1946, 1948). The subject was subsequently 
taken up by Grothendieck (1955) for locally convex spaces, and hence involves two 
of the greatest mathematicians of the twentieth century. Nuclearity of C*-algebras 
is a vast and important field, to which Takesaki (2003) is a good introduction. 

Yet another expression for the maximal C*-norm on $A \otimes B$ arises if we say that 
two representations $\pi_A : A \to B(H)$ and $\pi_B : B \to B(H)$ on the same Hilbert space $H$ 
commute if $\pi_A(a)\pi_B(b) = \pi_B(b)\pi_A(a)$ for all $a \in A$ and $b \in B$. Such a pair defines a 
representation $\pi_A \otimes \pi_B$ of $A \otimes B$ by 

$$\pi_A \otimes \pi_B(c) = \sum_i \pi_A(a_i)\pi_B(b_i), \tag{C.648}$$ 

where $c = \sum_i a_i \otimes b_i$, which makes sense because $(a,b) \mapsto \pi_A(a)\pi_B(b)$ is bilinear and 
hence (by universality of $\otimes$) factors through $A \otimes B$. This gives a third formula for 
$\|\cdot\|_{\max}$, namely 

$$\|c\|_{\max} = \sup\{\|\pi_A \otimes \pi_B(c)\|_{B(H)}\}, \tag{C.649}$$ 

where $\pi_A$ and $\pi_B$ run through all commuting representations of $A$ and $B$. Indeed, 
the restrictions of any representation of $A \otimes B$ to $A$ and $B$ define commuting 
representations, so that although at first sight the expression (C.649) appears to majorize 
(C.265), it must be equal to it in view of the equality of (C.265) and (C.263). 

The name projective tensor product for $A \hat{\otimes}_{\max} B$, where $A$ and $B$ are C*-algebras, 
is actually confusing, since if $A$ and $B$ are regarded as Banach algebras, their 
projective tensor product is usually defined as the completion of $A \otimes B$ in the norm 

$$\|c\|_{\mathrm{proj}} = \inf\Big\{ \sum_i \|a_i\|\,\|b_i\| \;\Big|\; c = \sum_i a_i \otimes b_i \Big\}, \tag{C.650}$$ 

cf. (C.259), which is defined for any two Banach algebras $A$ and $B$. This may not be 
a C*-norm, and hence $A \hat{\otimes}_{\mathrm{proj}} B$ may not be a C*-algebra. However, for any Banach 
algebra $C$ with involution, one may canonically construct a C*-algebra $C^*$ (sic) 
and a homomorphism $\varphi : C \to C^*$ of involutive Banach algebras, with the universal 
property that for any morphism $\beta : C \to D$, where $D$ is a C*-algebra, there is a 
unique homomorphism $\beta' : C^* \to D$ of C*-algebras such that $\beta = \beta' \circ \varphi$. This 
C*-algebra $C^*$, which by the usual argument is unique up to isomorphism, is called the 
C*-envelope of $C$. An explicit construction is obtained by completing $C$ in the norm 

$$\|c\| = \sup\{\|\pi(c)\|\}, \tag{C.651}$$ 

where the supremum runs over all representations of $C$ on Hilbert spaces; it is 
finite since $\|\pi(c)\| \le \|c\|$ for each $c \in C$, see Dixmier (1977), §1.3.7 and §2.7. It is 
easy to see that $\|\cdot\|_{\mathrm{proj}}$ is a cross-norm on $A \otimes B$, and that one has a bijective 
correspondence between representations of $A \otimes B$ that satisfy $\|\pi(a \otimes b)\| \le \|a\|\,\|b\|$ and 
representations of $A \hat{\otimes}_{\mathrm{proj}} B$. The point, then, is that one has $A \hat{\otimes}_{\max} B = (A \hat{\otimes}_{\mathrm{proj}} B)^*$. 

C*-algebras (with homomorphisms) and $\hat{\otimes}_{\max}$ form a monoidal category (also 
called a tensor category), with commutative C*-algebras as a full subcategory CCA. 
The map $X \mapsto C_0(X)$ then defines a duality as monoidal categories between the 
category $\mathrm{LCH}_p$ of locally compact Hausdorff spaces and proper continuous maps (with 
cartesian product as a tensor product) and the category $\mathrm{CCA}_n$ of commutative 
C*-algebras and nondegenerate homomorphisms (with its unique C*-algebraic tensor 
product, for example realized as $\hat{\otimes}_{\max}$). Cf. Theorem C.45. See Hofmann (1970). 

§C.14. Inductive limits and infinite tensor products of C*-algebras 

For inductive limits of C*-algebras see in particular Sakai (1971); they were 
originally a Japanese invention (Takeda). Infinite tensor products of operator algebras 
(which partly motivated inductive limits) go back to von Neumann (1938). Bounded 
monotone nets converge under very general conditions; see McArthur (1970). 

§C.15. Gelfand isomorphism and Fourier theory 

For details on the Haar measure and for the proof of local compactness of $\hat{G}$ see 
Weil (1965), §27. Our approach to the Fourier transform is largely taken from 
Deitmar & Echterhoff (2009), where complete proofs may be found (though we 
sometimes followed a slightly different approach). In particular, these authors introduced 
the Banach spaces $C_0^*(G)$ and $C_0^*(\hat{G})$, whose use forms a marked improvement over 
older and less elegant treatments, as in e.g. Rudin (1962) or Folland (1995). 

Eq. (C.379) is often called Plancherel’s Theorem. 

We may add a third entry to the ‘symmetric’ isomorphisms (C.379) - (C.380). 
The Bruhat space $\mathcal{S}(G)$ of rapidly decreasing functions on $G$ is defined by 

$$A(G) = \{ f \in L^\infty(G) \mid \forall K \in \mathcal{K}(G)\, \forall n > 0\, \exists C_n > 0\, \forall k > 0 : \| f|_{G \setminus K^k} \|_\infty \le C_n k^{-n} \};$$ 
$$\mathcal{S}(G) = \{ f \in L^\infty(G) \mid f \in A(G), \hat{f} \in A(\hat{G}) \}.$$ 

For $G = \mathbb{R}$ this recovers the usual test functions (cf. Definition 5.64), where 
the condition $f \in A(\mathbb{R})$ gives rapid decrease whereas $\hat{f} \in A(\mathbb{R})$ gives smoothness. 
Pontryagin duality then yields an isomorphism $\mathcal{S}(G) \cong \mathcal{S}(\hat{G})$ (Osborne, 1975). 

The author originally learnt the SNAG-Theorem from Barut & Rączka (1977), 
whose proof (due to K. Maurin) is quite different; the argument given above was 
inspired by the treatment of projection-valued spectral measures in Conway (2007, 
Ch. 9, §1), who calls them spectral measures. Conway also proves our Theorem 
C.113 as his Theorem 1.14, albeit for the case where $X$ is compact; passage to the 
locally compact case may be done through unitization, as in §C.6. The need for $\pi$ to 
be non-degenerate may then be traced back to (our) Lemma C.43. 

§C.16. Intermezzo: Lie groupoids 

For introductions to Lie groupoids see Moerdijk & Mrčun (2003) or Mackenzie 
(2005), who also described the link with symplectic geometry. For their use in 
noncommutative geometry and mathematical physics cf. Connes (1994) and Landsman 
(1998a, 2006b), respectively. The tangent groupoid was invented by Connes, with 
further contributions by Hilsum & Skandalis (1987), Weinstein (1989) and Landsman 
(1998a). See also Connes (1994), Landsman (2003), Higson (2010), and van 
Erp (2010) for applications of the tangent groupoid to index theory. 

§C.17. C*-algebras associated to Lie groupoids 

C*-algebras associated to locally compact groupoids (with Haar system) were 
first studied in detail by Renault (1980). Originally in the setting of foliation theory, 
the Lie (i.e. smooth) case was pioneered by Connes (1994), who noted in particular 
that Lie groupoids carry an intrinsic Haar system, and gave many interesting 
examples. The uniqueness of $C^*(G)$ for Lie groupoids $G$, i.e., the independence of the 
underlying left Haar system (up to isomorphism), is proved in Paterson (1999). 

§C.18. Group C*-algebras and crossed product algebras 

The locus classicus is Pedersen (1979), but Williams (2007) may even be better. 

§C.19. Continuous bundles of C*-algebras 

The bundles studied in this section were originally introduced by Fell (1961) and 
their theory was further developed by Dixmier & Douady (1963); see also Dixmier 
(1977), Fell & Doran (1988), and, for a modern treatment, Raeburn & Williams 
(1998). Lemma C.125 was part of Dixmier’s definition of a continuous field of C*- 
algebras, before it was recast into the rather more appealing Definition C.121 by 
Kirchberg & Wassermann (1995) and Blanchard (1996). Theorem C.123 is due to 
Landsman & Ramazan (2001); see also Landsman (1998a) for a detailed discussion. 
Aastrup, Nest, & Schrohe (2006) discuss applications to manifolds with boundary. 

§C.20. von Neumann algebras and the σ-weak topology 

There are many other topologies on von Neumann algebras, see e.g. Takesaki 
(2002), Chapter II. In any case, we only scratch the surface of the subject. 

§C.21. Projections in von Neumann algebras 

The first part of the proof of Theorem C.141 is taken from Redei (1998), Prop. 
4.16. The remainder is adapted from Heunen, Landsman, & Spitters (2012). The 
details of the proof of Theorem C.140 may be found in Takesaki (2002), Thm. III.1.18; 
see also Dixmier (1981), Ch. 7 and Lurie (2011), lectures 13-17. 

§C.23. Classification of hyperfinite factors 

This material, which is a high point in modern mathematics, is explained in great 
detail in Takesaki (2003ab). See also Wright (1989) for the uniqueness of the 
hyperfinite $\mathrm{III}_1$ factor. In his review MR1030046 (91a:46059) of the latter book for 
Mathematical Reviews in 1991, E. Størmer wrote: 

‘At the time of writing this review, by far the deepest and most difficult proof in von 
Neumann algebra theory is the one of Connes and Haagerup on the uniqueness of the injective 
factor of type $\mathrm{III}_1$ with separable predual.’ 

The applications of C*-algebras and von Neumann algebras to quantum field theory 
are reviewed in Haag (1992), where the identification of the unique hyperfinite 
$\mathrm{III}_1$ factor with local algebras of observables may be found in §V.6. This book 
also explains the relationship between Tomita-Takesaki theory and quantum 
statistical mechanics, as do Bratteli & Robinson (1981). It should be mentioned that 
the Tomita-Takesaki theory, including the modular group (i.e. of time translations), 
has a classical analogue in Poisson geometry (Weinstein, 1997), which somewhat 
softens the spectacular claim by Connes & Rovelli (1994) that time has a 
quantum-mechanical (or non-commutative) origin related to thermodynamics. 

§C.24. Other special classes of C*-algebras 

The classic reference on AW*-algebras and Rickart C*-algebras is Berberian 
(1972). For monotone complete C*-algebras see the monograph by Saito & Maitland 
Wright (2015b). Real rank zero was introduced by Brown & Pedersen (1991), 
who also proved that the definition of real rank zero in the main text may be replaced 
by an equivalent property that is often taken as the definition: 

Proposition C.180. Let $A$ be a unital C*-algebra. Then $\mathrm{rr}(A) = 0$ iff the set of 
self-adjoint elements with finite spectrum is dense in $A_{\mathrm{sa}}$. 

See Davidson (1996), Theorem V.7.3, for a streamlined proof. 

Scattered C*-algebras were independently introduced by Jensen (1977) and 
Huruya (1978). The results in the main text are due to Kusuda (2011). 

Theorem C.167.1 should be obvious. No. 2 is due to Kusuda (2011), no. 3 may 
be found in Takesaki (2002), §III.1, no. 4 is (a restatement of) Theorem 2.3.7 in 
Saito & Maitland Wright (2015b), no. 5 is Theorem 1.7.1 in Berberian (1972), no. 
6 is Theorem 1.8.1 in the same reference, no. 7 is from Saito & Maitland Wright 
(2015a), and finally no. 8 may be found in Blackadar (1994), §6.1.3. 

Theorem C.169.1 is Exercise 4.6.12 in Kadison & Ringrose (1983); it should 
be hidden from students that the AMS published two volumes with the answers to 
all their exercises! No. 2 is in Kusuda (2011), no. 3 is in Pedersen (1972), no. 4 
is (a restatement of) Theorem 8.2.5 in Saito & Maitland Wright (2015b), and no. 
5 easily follows from Corollary 2.7 in Saito & Maitland Wright (2015a). See also 
Lindenhovius (2016), where results of this kind are used to study the invariant $\mathcal{C}(A)$. 

§C.25. Jordan algebras and (pure) state spaces of C*-algebras 

Theorem C.172 is Corollary 4.20 in Alfsen & Shultz (2001), based on Kadison 
(1951). See also Roberts & Roepstorff (1969). Theorem C.174 is due to Hamhalter 
(2015); the second step in the proof had been given earlier by Heunen & Reyes 
(2014). A complete proof of Lemma C.173 may be found in Bratteli & Robinson 
(1997), Theorem 3.2.3. In particular, Kadison’s inequality is Proposition 3.2.4 in the 
same book. Theorem C.175 is the culmination of a long chain of argument, starting 
with Jacobson & Rickart (1950) and ending with Thomsen (1982). See also Bratteli 
& Robinson (1987), Theorem 3.2.3. 

The formula (C.629) was proposed by Mielnik (1968, 1969). Otherwise, case 1 
of Proposition C.177 is due to Roberts & Roepstorff (1969), who state case 2 
without proof, referring to Glimm & Kadison (1960). Theorem C.179 is due to Shultz 
(1982). A completely different proof of the last claim, based on a reconstruction of 
$A$ from $P(A)$, appears in Landsman (1998a), §1.3. Both authors add further structure 
to $P(A)$ to make it an invariant for $A$ as a C*-algebra, viz. an orientation and a 
Poisson structure, respectively. The notion of an orientation was originally introduced 
by Alfsen & Shultz in order to make $S(A)$ a complete invariant for $A$; see their final 
work Alfsen & Shultz (2001, 2003). 

Lattices and logic 


In this appendix we collect some basic material from the theory of lattices, 
including Stone’s representation theorem for Boolean lattices and the connection between 
Boolean (Heyting) lattices and classical (intuitionistic) propositional logic. In 
preparation for Appendix E, we also provide an introduction to first-order logic. 


D.1 Order theory and lattices 

One hopes that the reader has seen some of the following concepts before! 

Definition D.1. 1. A preorder on a set $X$ is a subset $R \subseteq X \times X$ (i.e., a relation on 
$X$), where we write $x \le y$ or $y \ge x$ iff $(x,y) \in R$, such that $x \le x$, and $x \le y$ and 
$y \le z$ imply $x \le z$. A preorder is a partial order if in addition $x \le y$ and $y \le x$ 
imply $x = y$. A set with a partial order is called a poset (for partially ordered 
set). A poset (or preorder) is directed if every pair $\{x,y\}$ has an upper bound, 
i.e., some $z$ for which $x \le z$ and $y \le z$. A poset may have a largest element (also 
called a top element), denoted by $1$ or $\top$, that satisfies $x \le \top$ for each $x \in X$, 
and/or a smallest element (also called a bottom element), denoted by $0$ or $\bot$, that 
satisfies $\bot \le x$ for each $x \in X$. For $x, z \in X$, the order interval $[x, z]$ is defined by 

$$[x,z] = \{y \mid x \le y \le z\}. \tag{D.1}$$ 

An atom in a poset with $0$ is an element $x \neq 0$ for which $[0,x] = \{0,x\}$. In other 
words, $x$ is an atom if $x \neq 0$, and $0 \le y \le x$ implies $y = 0$ or $y = x$. Thus $x$ is an 
atom iff $x$ covers $0$, where we say that $x$ covers $y$ if $x > y$ and $[y,x] = \{y,x\}$. 

A homomorphism between posets is a map that preserves $\le$. As usual, an 
isomorphism is an invertible (i.e. bijective) homomorphism, such that the inverse 
also preserves the given structure (which, in this case, is $\le$). 

Thus a bijection $\varphi : X \to Y$ between posets $X$ and $Y$ is an isomorphism when 
($\varphi(x) \le \varphi(y)$ iff $x \le y$). In some cases, the inverse of a bijective homomorphism 
automatically preserves the relevant structure. 

© The Author(s) 2017 777 
2. A lattice is a poset in which for any two elements $x, y$, there exists: 

• an element $x \vee y$, called the supremum (sup) of $x$ and $y$, such that 

$$x \le x \vee y; \tag{D.2}$$ 
$$y \le x \vee y, \tag{D.3}$$ 

and if $x \le z$ and $y \le z$ for some $z$, then $x \vee y \le z$; 

• an element $x \wedge y$, called the infimum (inf) of $x$ and $y$, such that 

$$x \ge x \wedge y; \tag{D.4}$$ 
$$y \ge x \wedge y, \tag{D.5}$$ 

and if $x \ge z$ and $y \ge z$ for some $z$, then $x \wedge y \ge z$. 

Suprema and infima are unique (if they exist). Equivalently, a lattice may be 
defined algebraically (rather than order-theoretically) as a set equipped with two 
idempotent, commutative, and associative binary operations $\vee$, $\wedge$ that satisfy 

$$x \vee (y \wedge x) = x; \tag{D.6}$$ 
$$x \wedge (y \vee x) = x. \tag{D.7}$$ 

The corresponding partial ordering is then defined by $x \le y$ if $x \wedge y = x$. 

3. A homomorphism between lattices is a map that preserves $\vee$ and $\wedge$. 

In this case, an isomorphism of lattices may be defined as a bijective 
homomorphism, whose inverse automatically preserves $\vee$ and $\wedge$ (and similarly in all other 
cases below). One may also consider order homomorphisms between lattices just 
regarded as posets. This is a weaker notion: a lattice homomorphism is an order 
homomorphism, but not necessarily the other way round. However, an order 
isomorphism between lattices turns out to be the same as a lattice isomorphism. 

4. A complete lattice is a poset $X$ in which each subset $S$ of $X$ has a supremum $\bigvee S$ 
(i.e., $x \le \bigvee S$ for each $x \in S$, and $x \le z$ for each $x \in S$ implies $\bigvee S \le z$), as well 
as an infimum $\bigwedge S$ (i.e., $x \ge \bigwedge S$ for each $x \in S$, and $x \ge z$ for each $x \in S$ implies 
$\bigwedge S \ge z$). Clearly, taking $S$ finite makes a complete lattice (merely) a lattice. A 
complete lattice $X$ has a largest element $1 = \bigvee X$ and a smallest element $0 = \bigwedge X$. 

5. A lattice is distributive if either one (and hence both) of the following equivalent 
properties holds: 

$$x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z); \tag{D.8}$$ 
$$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z). \tag{D.9}$$ 

6. A frame is a complete lattice $X$ which is “infinitely distributive” in that 

$$x \wedge \bigvee S = \bigvee \{x \wedge y, y \in S\}, \tag{D.10}$$ 

for arbitrary subsets $S \subseteq X$. A frame is clearly distributive. Frame 
homomorphisms by definition preserve finite infima and arbitrary suprema. 
7. A Heyting algebra is a lattice $X$ with top $\top$ and bottom $\bot$, equipped with a map 
$\to : X \times X \to X$, called (material) implication, that satisfies 

$$x \le (y \to z) \text{ iff } (x \wedge y) \le z. \tag{D.11}$$ 

A Heyting algebra is automatically distributive. Negation is defined by 

$$\neg x = (x \to \bot). \tag{D.12}$$ 

A Heyting algebra is complete when it is complete as a lattice, in that arbitrary 
suprema (and hence also infima) exist. In that case, (D.10) is satisfied, so that a 
complete Heyting algebra is a frame. Conversely, a frame becomes a complete 
Heyting algebra if we define the implication arrow $\to$ by 

$$y \to z = \bigvee \{x \in X \mid x \wedge y \le z\}. \tag{D.13}$$ 

However, frames and complete Heyting algebras drift apart as soon as morphisms 
are concerned, for although in both cases one requires maps to preserve the partial 
order, maps between Heyting algebras must preserve $\to$ rather than $\bigvee$. 

8. An orthocomplementation on a lattice (poset) $X$ with $0$ and $1$ is a map 

$$\perp : X \to X, \quad x \mapsto x^\perp, \tag{D.14}$$ 

that satisfies: 

$$x^{\perp\perp} = x; \tag{D.15}$$ 
$$x \le y \text{ iff } y^\perp \le x^\perp; \tag{D.16}$$ 
$$x \wedge x^\perp = 0 \quad (\text{i.e., } x \wedge x^\perp \text{ exists and equals } 0); \tag{D.17}$$ 
$$x \vee x^\perp = 1 \quad (\text{i.e., } x \vee x^\perp \text{ exists and equals } 1). \tag{D.18}$$ 

A lattice (poset) with an orthocomplementation is called orthocomplemented. 
A homomorphism of orthocomplemented lattices (posets) is a lattice (order) 
morphism that also preserves the orthocomplementation, as well as 0 or 1. 

9. A lattice is called modular if $x \le z$ implies $x \vee (y \wedge z) = (x \vee y) \wedge z$ for each $y$ (i.e., 
if distributivity holds merely if $x \le z$). 

Hence modularity is a weakening of the following property: 

10. A distributive orthocomplemented lattice is called Boolean. A homomorphism 
between Boolean lattices is just a homomorphism of orthocomplemented lattices. 
An isomorphism of Boolean lattices is an invertible homomorphism, i.e., a 
bijection that preserves $\vee$, $\wedge$, and $\perp$. 

11. An orthocomplemented lattice (poset) is called orthomodular if $x \le z$ implies 
(that $x \vee z^\perp$ and hence $x^\perp \wedge z$ exist and that) 

$$x \vee (x^\perp \wedge z) = z. \tag{D.19}$$ 

That is, the modularity axiom holds for $y = x^\perp$ (note that $x \vee (x^\perp \wedge z)$ exists 
because $x \le x \vee z^\perp$). For lattices this axiom is equivalent to each of: 

• $x \le z$ and $x^\perp \wedge z = 0$ imply $x = z$. 

• $xCy$ iff $yCx$, where $xCy$ if $x = (x \wedge y) \vee (x \wedge y^\perp)$ (i.e., $x$ and $y$ are compatible). 
In a Boolean lattice any two elements are compatible, reconfirming the fact 
that orthomodularity is a weakening of modularity and hence of Booleanity. 

A homomorphism between orthomodular lattices (posets) is a map $\varphi$ that 
preserves $0$ and $\perp$ (and hence preserves $1$), and satisfies $\varphi(x \vee y) = \varphi(x) \vee \varphi(y)$ 
(if $x \le y^\perp$). An isomorphism between orthomodular lattices (posets) is 
an invertible homomorphism, which is automatically an order isomorphism. 

Every Boolean algebra is a Heyting algebra, but not vice versa; a Heyting algebra is 
Boolean iff one and hence both of the following equivalent conditions hold: 

$$\neg\neg x = x \quad (x \in X); \tag{D.20}$$ 
$$(\neg x) \vee x = \top \quad (x \in X), \tag{D.21}$$ 

which state the law of the excluded middle (famously denied by Brouwer). 
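
To make the difference between Heyting and Boolean lattices concrete, the following minimal Python sketch computes joins, meets, the implication (D.13), and the negation (D.12) by brute force from the order relation of a finite lattice, and then tests (D.11), (D.20), and (D.21). The two examples (a three-element chain and the divisors of 30 ordered by divisibility) and all function names are illustrative assumptions that do not occur in the main text.

```python
from itertools import product

# Hypothetical finite examples, chosen only for illustration.
CHAIN = [0, 1, 2]                        # chain 0 < 1 < 2 (bottom 0, top 2)
DIVISORS = [1, 2, 3, 5, 6, 10, 15, 30]   # divisors of 30, ordered by divisibility


def chain_leq(a, b):
    return a <= b


def divides(a, b):
    return b % a == 0


def make_lattice(elements, leq):
    """Compute join, meet, implication and negation from the order alone."""

    def join(x, y):
        ups = [z for z in elements if leq(x, z) and leq(y, z)]
        # least upper bound: the upper bound lying below all other upper bounds
        return next(u for u in ups if all(leq(u, v) for v in ups))

    def meet(x, y):
        lows = [z for z in elements if leq(z, x) and leq(z, y)]
        return next(l for l in lows if all(leq(v, l) for v in lows))

    bottom = next(b for b in elements if all(leq(b, x) for x in elements))
    top = next(t for t in elements if all(leq(x, t) for x in elements))

    def implies(y, z):
        # (D.13): y -> z is the join of all x with x ∧ y <= z
        result = bottom
        for x in elements:
            if leq(meet(x, y), z):
                result = join(result, x)
        return result

    def neg(x):
        return implies(x, bottom)        # (D.12)

    return join, meet, implies, neg, bottom, top


def check(elements, leq):
    """Return whether (D.11), (D.20) and (D.21) hold, in this order."""
    join, meet, implies, neg, bottom, top = make_lattice(elements, leq)
    adjunction = all(leq(x, implies(y, z)) == leq(meet(x, y), z)
                     for x, y, z in product(elements, repeat=3))
    double_negation = all(neg(neg(x)) == x for x in elements)
    excluded_middle = all(join(neg(x), x) == top for x in elements)
    return adjunction, double_negation, excluded_middle


if __name__ == "__main__":
    print(check(CHAIN, chain_leq))       # Heyting, but not Boolean
    print(check(DIVISORS, divides))      # Boolean
```

On the chain, the middle element $h = 1$ has $\neg h = 0$ and $\neg\neg h = 2$, so that both (D.20) and (D.21) fail although (D.11) holds; for the divisors of 30 one finds $\neg x = 30/x$, and all three properties hold.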

The following result will be used implicitly throughout the main text. 

Proposition D.2. An order isomorphism of a lattice preserves all suprema and 
infima that exist. Hence in a complete lattice all suprema and infima are preserved. 

An important source of orthocomplemented lattices is provided by (possibly 
infinite-dimensional) complex vector spaces $V$ with inner product, cf. Definition 
A.1: the elements of $X$ are the orthoclosed subspaces $L \subseteq V$, i.e., those subspaces 
for which $L^{\perp\perp} = L$, where orthocomplementation is defined by 

$$L^\perp = \{v \in V \mid \forall w \in L : \langle v, w \rangle = 0\}, \tag{D.22}$$ 

and the partial ordering is given by inclusion. This yields 

$$L \wedge M = L \cap M; \tag{D.23}$$ 
$$L \vee M = (L + M)^{\perp\perp} = (L^\perp \cap M^\perp)^\perp, \tag{D.24}$$ 

where $L + M$ is the linear span of $L$ and $M$. We have the Amemiya-Araki Theorem: 

Theorem D.3. The lattice of orthoclosed subspaces of an inner product space $V$ is 
orthomodular iff $V$ is complete in the norm (A.2) associated to the inner product. 
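
In finite dimension every subspace is orthoclosed and completeness is automatic, so that Theorem D.3 gives orthomodularity; distributivity (D.9), on the other hand, already fails for three distinct lines in a plane. The following Python sketch illustrates this by comparing dimensions, using (D.23) and (D.24) together with the dimension formula. For the sake of exact arithmetic it works over the rationals rather than the complex numbers, and the three lines as well as all function names are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def rank(vectors):
    """Dimension of the linear span of `vectors`, by exact Gaussian elimination."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                    # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def dim_join(U, V):
    """dim(U ∨ V) = dim(U + V); subspaces are given as lists of spanning vectors."""
    return rank(U + V)


def dim_meet(U, V):
    """dim(U ∧ V) = dim(U ∩ V), from dim U + dim V = dim(U + V) + dim(U ∩ V)."""
    return rank(U) + rank(V) - rank(U + V)


# Three distinct lines in the plane (illustrative choice).
L = [(1, 0)]
M = [(0, 1)]      # M is the orthocomplement of L for the standard inner product
N = [(1, 1)]

lhs = dim_meet(L, M + N)                      # dim of L ∧ (M ∨ N); M + N spans M ∨ N
rhs_parts = (dim_meet(L, M), dim_meet(L, N))  # dims of L ∧ M and L ∧ N
```

Here `lhs` is the dimension of $L \wedge (M \vee N) = L$, whereas both entries of `rhs_parts` vanish, so that $(L \wedge M) \vee (L \wedge N) = \{0\}$ and (D.9) fails. By contrast, `dim_join(L, M)` equals 2, in agreement with (D.19) for $x = L$ and $z = V$.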

A space $X$ is called totally disconnected if it has no other connected subspaces 
than its points (so any larger subspace of $X$ is the union of two proper clopen sets). 

Definition D.4. A Stone space is a totally disconnected compact Hausdorff space. 

Any finite set (with the discrete topology) is a Stone space. The best-known example 
of an infinite Stone space is the Cantor set $\{0,1\}^{\mathbb{N}}$ with product topology, which in 
addition is metrizable and has no isolated points (these properties even characterize 
the Cantor set up to homeomorphism). Stone’s Representation Theorem reads: 

Theorem D.5. A lattice $L$ is Boolean iff it is isomorphic to the lattice $\mathrm{Clopen}(X)$ of 
all clopen subsets of some Stone space $X$ (partially ordered by set-theoretic 
inclusion), where $X$ is uniquely determined by $L$ up to homeomorphism. 

Thus the lattice operations in $\mathrm{Clopen}(X)$ are simply given set-theoretically by 

$$U \vee W = U \cup W; \tag{D.25}$$ 
$$U \wedge W = U \cap W, \tag{D.26}$$ 

with orthocomplementation given by set-theoretic complementation (the theorem is 
obviously predicated on the fact that such a lattice is Boolean). The space $X$ is called 
the Stone spectrum of $L$, generically denoted by $\mathcal{S}(L)$. Just like Gelfand duality, 
Theorem D.5 extends to a categorical duality theorem in an obvious way. 

The Stone spectrum $\mathcal{S}(L)$ of $L$ has the following canonical realizations: 

1. Consider the space $\mathrm{Pt}(L) = \mathrm{Hom}(L, 2)$, where $2 = \{0,1\}$ is seen as a Boolean 
lattice ordered by $0 \le 1$ (and $0 \neq 1$), with topology inherited from the product 
topology on $2^L$. That is, the basic opens in $\mathrm{Pt}(L)$ are the sets 

$$U_x = \{\varphi \in \mathrm{Pt}(L) \mid \varphi(x) = 1\}, \tag{D.27}$$ 

where $x \in L$, and similarly with 1 replaced by 0. This is a Stone space, with isomorphism 

$$L \stackrel{\cong}{\to} \mathrm{Clopen}(\mathrm{Pt}(L)); \tag{D.28}$$ 
$$x \mapsto U_x. \tag{D.29}$$ 

2. Generalizing the case of a power set (cf. Definition B.49), a filter in a (Boolean) 
lattice $L$ is a nonempty subset $F \subseteq L$ such that $x, y \in F$ implies $x \wedge y \in F$, and 
$y \ge x \in F$ implies $y \in F$ (whence $1 \in F$). A filter $F$ is proper if $F \neq L$, which 
is the case iff $0 \notin F$. An ultrafilter is a filter that is maximal in the set of all 
proper filters, ordered by inclusion. Ultrafilters (i.e. maximal filters) in a Boolean 
lattice are the same as prime filters, which are filters for which $x \vee y \in F$ implies 
$x \in F$ or $y \in F$. More generally, in a distributive lattice with 0 any maximal filter 
is prime, and the presence of an orthocomplementation also gives the converse 
inclusion. Moreover, a filter $F$ in a Boolean lattice is maximal (and hence prime) 
iff for any $x \in L$ either $x \in F$ or $x^\perp \in F$ (but not both). For $x \in L$, let 

$$U'_x = \{F \in \mathcal{U}(L) \mid x \in F\}, \tag{D.30}$$ 

where $\mathcal{U}(L)$ is the set of all ultrafilters on $L$. One has $U'_x \cap U'_y = U'_{x \wedge y}$ as well 
as $U'_x \cup U'_y = U'_{x \vee y}$, $U'_x \subseteq U'_y$ if $x \le y$, and the subsets $U'_x \subseteq \mathcal{U}(L)$ form the basis of a 
topology on $\mathcal{U}(L)$ whose open sets are sets $U' \subseteq \mathcal{U}(L)$ with the property that for 
each $F \in U'$ there is $x \in L$ with $F \in U'_x \subseteq U'$. This topology makes $\mathcal{U}(L)$ a Stone 
space, whose basis of clopen sets is given by the $U'_x$, $x \in L$, with isomorphism 

$$L \stackrel{\cong}{\to} \mathrm{Clopen}(\mathcal{U}(L)); \tag{D.31}$$ 
$$x \mapsto U'_x. \tag{D.32}$$ 

3. Instead of filters, one may consider the dual notion of ideals, obtained by 
reversing the order (and hence swapping $\wedge$ and $\vee$). Thus an ideal in $L$ is a subset $I \subseteq L$ 
such that $x, y \in I$ implies $x \vee y \in I$, and $y \le x \in I$ implies $y \in I$ (whence $0 \in I$). An 
ideal $I$ is proper if $I \neq L$, which is the case iff $1 \notin I$. A maximal ideal is an ideal 
that is maximal in the set of all proper ideals, ordered by inclusion. In a Boolean 
lattice, maximal ideals coincide with prime ideals, which are ideals $I$ that do not 
contain 1, and where $x \wedge y \in I$ implies $x \in I$ or $y \in I$. In a distributive lattice with 0 
any maximal ideal is prime. The (set-theoretic) complement of a maximal ideal 
is a maximal filter (i.e. an ultrafilter), so that an ideal $I$ in a Boolean lattice is 
maximal (and hence prime) iff for any $x \in L$ either $x \in I$ or $x^\perp \in I$ (but not both). 
The space $\mathcal{I}(L)$ of all maximal (i.e. prime) ideals in $L$ is topologized by basic 
opens $U''_x = \{I \in \mathcal{I}(L) \mid x \notin I\}$, and so this time the desired isomorphism is 

$$L \stackrel{\cong}{\to} \mathrm{Clopen}(\mathcal{I}(L)); \tag{D.33}$$ 
$$x \mapsto U''_x. \tag{D.34}$$ 

4. Finally, the set $\mathrm{Idl}(L)$ of all ideals in a (Boolean) lattice $L$ is a frame if it is 
partially ordered by inclusion (cf. §C.11). One may realize the points of the frame 
$\mathrm{Idl}(L)$ as its prime elements (cf. Lemma C.85), which are simply the prime (and 
hence maximal) ideals in $L$ considered above. Hence $\mathrm{Pt}(\mathrm{Idl}(L))$ forms a model of 
the Stone spectrum $X$ of $L$, too. The advantage of this realization is that it gives 
a direct description of the topology of $X$ (seen as a frame), namely as 

$$\mathcal{O}(X) \cong \mathrm{Idl}(L). \tag{D.35}$$ 

The relationship between the first three approaches is that for any $\varphi \in \mathrm{Pt}(L)$, the 
set $\varphi^{-1}(\{1\})$ is a maximal filter in $L$, whose complement $\varphi^{-1}(\{0\})$ is a maximal 
ideal. This can be shown to give homeomorphisms $\mathrm{Pt}(L) \cong \mathcal{U}(L) \cong \mathcal{I}(L)$, 
under which the opens $U_x$, $U'_x$, and $U''_x$ are mapped to each other. The 
(contravariant) functorial nature of the Stone spectrum comes out particularly clearly in the 
first description; given a homomorphism $h : L \to L'$, we immediately obtain a map 
$h^* : \mathrm{Pt}(L') \to \mathrm{Pt}(L)$ by pullback (i.e., $h^*\varphi = \varphi \circ h$). In this description the 
isomorphism $X \cong \mathrm{Pt}(\mathrm{Clopen}(X))$ is given by $x \mapsto \varphi_x$, where $\varphi_x(U) = 1_U(x)$, with 
$U \in \mathrm{Clopen}(X)$. In the second description, the isomorphism $X \cong \mathcal{U}(\mathrm{Clopen}(X))$ 
is given by $x \mapsto \{U \in \mathrm{Clopen}(X) \mid x \in U\}$, which also gives the isomorphism 
$X \cong \mathcal{I}(\mathrm{Clopen}(X))$ of the third description as $x \mapsto \{U \in \mathrm{Clopen}(X) \mid x \notin U\}$. 

Eq. (D.35) follows from Theorem D.5, which implies an isomorphism of frames 

$$\mathcal{O}(X) \stackrel{\cong}{\to} \mathrm{Idl}(\mathrm{Clopen}(X)); \tag{D.36}$$ 
$$U \mapsto \{V \in \mathrm{Clopen}(X) \mid V \subseteq U\}, \tag{D.37}$$ 

with inverse $I \mapsto \bigcup I$. However, by itself, eq. (D.35) may also be taken as a 
constructive version of Stone’s Representation Theorem; the next, non-constructive 
step (relying on Zorn’s Lemma) then gives the points of $X$ from $\mathrm{Idl}(L)$, cf. §C.11. 
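
For a finite Boolean lattice the first two realizations of the Stone spectrum can be compared by brute force. The Python sketch below takes $L$ to be the power set of a hypothetical three-point set (i.e., $\mathrm{Clopen}(X)$ for a discrete $X$), enumerates $\mathrm{Pt}(L) = \mathrm{Hom}(L, 2)$ and the ultrafilters on $L$, and records the basic opens (D.27). All names are assumptions made for this illustration, and the exhaustive search is only feasible for very small lattices.

```python
from itertools import combinations, product

POINTS = ("a", "b", "c")                     # hypothetical three-point space X


def powerset(points):
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]


L = powerset(POINTS)                         # Boolean lattice Clopen(X) of a discrete X
TOP, BOTTOM = frozenset(POINTS), frozenset()


def pt(lattice):
    """Pt(L) = Hom(L, 2): maps into {0, 1} preserving 0, 1, joins and meets."""
    homs = []
    for values in product((0, 1), repeat=len(lattice)):
        phi = dict(zip(lattice, values))
        if phi[BOTTOM] != 0 or phi[TOP] != 1:
            continue
        if all(phi[x | y] == max(phi[x], phi[y]) and
               phi[x & y] == min(phi[x], phi[y])
               for x in lattice for y in lattice):
            homs.append(phi)
    return homs


def ultrafilters(lattice):
    """Filters F such that, for each x, exactly one of x and its complement lies in F."""
    found = []
    for r in range(1, len(lattice) + 1):
        for F in map(frozenset, combinations(lattice, r)):
            closed_under_meets = all(x & y in F for x in F for y in F)
            upward_closed = all(y in F for x in F for y in lattice if x <= y)
            decisive = all((x in F) != ((TOP - x) in F) for x in lattice)
            if closed_under_meets and upward_closed and decisive:
                found.append(F)
    return found


def basic_open(homs, x):
    """U_x of (D.27), recorded as the set of indices of the points phi with phi(x) = 1."""
    return frozenset(i for i, phi in enumerate(homs) if phi[x] == 1)
```

In this example `pt(L)` and `ultrafilters(L)` each have three elements, one for each point of $X$; the sets $\varphi^{-1}(\{1\})$ are exactly the ultrafilters, and $x \mapsto U_x$ takes eight distinct values, exhausting the (clopen) subsets of the three-point space $\mathrm{Pt}(L)$, as in (D.28) - (D.29).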

To close this brief introduction to lattice theory, we present a general construction 
of free distributive lattices, possibly with relations, which will be needed for the 
theory of the constructive Gelfand spectrum in §12.2. The main advantage of this 
construction is that it can be performed in any topos, as will indeed be done in §12.4. 

Definition D.6. The free distributive lattice $\mathcal{L}_S$ on a set $S$ is the set of irredundant 
finite subsets $\{A_1, \ldots, A_n\}$ of the finite power set $\mathcal{P}_f(S)$ of $S$, i.e., $A_i \subseteq S$, $|A_i| < \infty$, 
$n \in \mathbb{N}$, and no $A_i$ is a proper subset of any $A_j$, with lattice operations inductively 
generated (using distributivity) from the following singleton cases: 

$$\{\{s\}\} \vee \{\{t\}\} = \{\{s\},\{t\}\}; \tag{D.38}$$ 
$$\{\{s\}\} \wedge \{\{t\}\} = \{\{s,t\}\}. \tag{D.39}$$ 

For $\{A_1, \ldots, A_n\} \in \mathcal{L}_S$ as above, and similarly $\{B_1, \ldots, B_m\} \in \mathcal{L}_S$, these rules imply 

$$\{A_1, \ldots, A_n\} \vee \{B_1, \ldots, B_m\} = \{A_1, \ldots, A_n, B_1, \ldots, B_m\}_{\mathrm{ir}}, \tag{D.40}$$ 
$$\{A_1, \ldots, A_n\} \wedge \{B_1, \ldots, B_m\} = \{A_i \cup B_j \mid i = 1, \ldots, n,\ j = 1, \ldots, m\}_{\mathrm{ir}}, \tag{D.41}$$ 

where the subscript ir means that redundancies in the above sense have been 
removed by deleting any set on the list that properly contains some other set on the 
list. The motivation for this rule is that, using distributivity, any element $x$ of a 
distributive lattice can be brought into the (“normal”) form $x = x_1 \vee \cdots \vee x_n$, where each 
$x_i = s_{i1} \wedge \cdots \wedge s_{ik_i}$ is a finite meet. We then identify $A_i$ with $\{s_{i1}, \ldots, s_{ik_i}\}$, so 
that $x_i = \bigwedge A_i$, and identify $\{A_1, \ldots, A_n\}$ with $x_1 \vee \cdots \vee x_n$. If we allow empty sets 
(as we do), then $\mathcal{L}_S$ has both a bottom element $\bot = \bigvee \emptyset$ and a top element $\top = \bigwedge \emptyset$. 
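
The rules (D.40) and (D.41) are easily implemented on a computer. In the following minimal Python sketch an element of the free distributive lattice is a `frozenset` of `frozenset`s of generators, the bottom element is the empty family, and the top element is the family consisting of the empty set only. The function names and the closure routine `generate` are assumptions made for this illustration and do not occur in the main text.

```python
def irredundant(family):
    """Drop every set that properly contains another set of the family (subscript 'ir')."""
    family = set(family)
    return frozenset(A for A in family if not any(B < A for B in family))


def gen(s):
    """The generator {{s}} attached to an element s of S."""
    return frozenset({frozenset({s})})


def join(X, Y):
    return irredundant(X | Y)                             # (D.40)


def meet(X, Y):
    return irredundant(A | B for A in X for B in Y)       # (D.41)


BOTTOM = frozenset()                 # empty join
TOP = frozenset({frozenset()})       # empty meet


def generate(symbols):
    """Close the generators, TOP and BOTTOM under join and meet (finite for finite S)."""
    elements = {BOTTOM, TOP} | {gen(s) for s in symbols}
    while True:
        new = {op(X, Y) for X in elements for Y in elements
               for op in (join, meet)} - elements
        if not new:
            return elements
        elements |= new


s, t, u = gen("s"), gen("t"), gen("u")
absorption = join(s, meet(s, t)) == s
distributivity = meet(s, join(t, u)) == join(meet(s, t), meet(s, u))
```

With these definitions `join(s, t)` and `meet(s, t)` reproduce (D.38) and (D.39), both `absorption` and `distributivity` hold, and `generate("st")` and `generate("stu")` have 6 and 20 elements, respectively, namely the antichains of subsets of $S$ (including $\bot$ and $\top$).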

Consequently, an equivalent description of .jZy is to first define the set Z of all 
formal expressions inductively defined by the rules: (i) 5 C Z, _L G Z, and T G Z; 
(ii) ifx G Z andy G Z, thenxVy G Z andxAy G Z. Secondly, we quotient Z by the 
equivalence relation generated by all of the basic identities in a distributive lattice, 
i.e., the commutativity, associativity, idempotency, and distributivity laws for V and 
A, the rules x V _L = x and x A T = x, and the absorption law x V (x A y) = x. The 
lattice operations on the quotient are the ones inherited from concatenation on Z. 

As in most free constructions, the map $S \mapsto \mathcal{L}_S$ is left adjoint to the forgetful functor from the category of distributive lattices into Sets. One has a canonical map $i : S \to \mathcal{L}_S$, given by $i(s) = \{\{s\}\}$, with the universal property that any function $f : S \to L$ from $S$ to some distributive lattice $L$ factors through $\mathcal{L}_S$, i.e., there is a unique lattice homomorphism $g : \mathcal{L}_S \to L$ such that $f = g \circ i$. Indeed, $g$ may be inductively generated from the special case $g(\{\{s\}\}) = f(s)$ using the rules (D.38) - (D.39).

One may enrich this construction by introducing a congruence $\sim$ on $\mathcal{L}_S$, for example one generated by relations $x_i = y_i$, $i \in I$. In that case, the ensuing quotient $\mathcal{L}_S/\!\sim$ exists, and is universal for homomorphisms $f : \mathcal{L}_S \to M$ of distributive lattices that satisfy $f(x_i) = f(y_i)$, i.e., if $p : \mathcal{L}_S \to \mathcal{L}_S/\!\sim$ is the canonical projection, there is a unique homomorphism of distributive lattices $g : \mathcal{L}_S/\!\sim\ \to M$ such that $f = g \circ p$.

D.2 Propositional logic

The topos-theoretical approach to quantum logic discussed in Chapter 12 uses an advanced version of an elementary construction in algebraic logic that relates classical propositional logic to Boolean algebras (or lattices), and similarly relates intuitionistic propositional logic to Heyting algebras. Though easy to state, these relationships are conceptually quite deep, based as they are on a separation between syntax and semantics that is decidedly “modern”, reflecting a view on the nature of mathematics that would have been completely foreign to e.g. Newton and Euler or even Gauss, not to speak of Euclid and Archimedes, notwithstanding their use of the axiomatic-deductive method that has been a defining property of (real) mathematics since its birth in Plato’s Academy. As expressed by Boole himself, this modern view is:

‘They who are acquainted with the present state of the theory of Symbolic Algebra, are 
aware that the validity of the processes of analysis does not depend upon the interpretation 
of the symbols which are employed, but solely upon the laws of their combination.’ 

(Boole, 1847, Preface) 

The formalization of mathematics starts with propositional logic, whose notation consists of the following groups of symbols in terms of which a theory is defined:

1. Purely logical symbols $\neg$, $\wedge$, $\vee$, and $\to$ (which, because of the axioms they will be subject to, may later be interpreted as not, and, or, implies, respectively).

2. Non-logical symbols $p_1, p_2, \ldots$ (also written $p, p'$ or $p, q, \ldots$), which denote atomic propositions (being the simplest examples of propositions, see below). The set $\Sigma = \{p_1,\ldots\}$ (at most countable) is called the signature of a theory.

As in arithmetic, there is some ambiguity to be dispelled. This may be done either by introducing brackets (,), subject to obvious rules we omit, or by conventions to the effect that $\neg$ “binds” symbols more strongly than $\vee$ and $\wedge$, which in turn “bind” more strongly than $\to$. For example, $\neg\alpha \vee \delta \to \beta \wedge \gamma$ is the same as $((\neg\alpha) \vee \delta) \to (\beta \wedge \gamma)$.

In propositional logic (unlike in first-order logic), well-formed formulae and propositions coincide; typically denoted by Greek letters $\alpha, \beta, \ldots$, both are defined as expressions in the above symbols that (iteratively) arise in the following way:

i) Each non-logical symbol $p_1, p_2, \ldots$ present in the signature $\Sigma$ is a proposition.

ii) If $\alpha$ and $\beta$ are propositions, then so are $\alpha \wedge \beta$, $\alpha \vee \beta$, $\alpha \to \beta$, and $\neg\alpha$.

Also here one may use brackets in the obvious way, e.g., if $\alpha$ is $p_1 \to p_2$, and $\beta$ is $p_1 \wedge p_3$, then $(p_1 \to p_2) \to (p_1 \wedge p_3)$ is the same as $\alpha \to \beta$.

For example, one may check that the following expression is a valid proposition:

$(p_1 \to (p_2 \to p_3)) \to ((p_1 \to p_2) \to (p_1 \to p_3))$. (D.42)

A final informal symbol we use is $\equiv$, as in $\alpha \equiv \beta$, which has no logical meaning, but states that $\alpha$ is the same as $\beta$ (e.g., for $\alpha \equiv (p_1 \to (p_2 \to p_3))$, consider $\neg\alpha$).

The notion of a (propositional) theory will be picked up later, but we now interrupt the construction of the syntax of propositional logic and discuss its semantics. In its most elementary form, this means that there is a valuation on $\Sigma$, i.e.,

$V : \Sigma \to \{0,1\}$, (D.43)

also called a truth function, where 0 = false and 1 = true; one often writes $\alpha = 1$ for $V(\alpha) = 1$ (i.e., $\alpha$ is true), and $\alpha = 0$ if $\alpha$ is false (this formally introduces a new symbol $=$, which however is foreign to propositional logic). Let $B_\Sigma$ be the set of all propositions (i.e., well-formed formulae) on the given signature $\Sigma$. With abuse of notation (justified by the property $\Sigma \subset B_\Sigma$), $V$ uniquely extends to a function

$V : B_\Sigma \to \{0,1\}$, (D.44)

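as follows. First, each $V(p_i)$ is fixed by the given function (D.43). Second, the value of $V$ on compound expressions is (iteratively) determined through the use of truth tables, which formalize the everyday meaning of the symbols $\neg$, $\wedge$, $\vee$, $\to$:

  $\alpha$   $\neg\alpha$
  0          1
  1          0

  $\alpha$   $\beta$   $\alpha \wedge \beta$
  0          0         0
  0          1         0
  1          0         0
  1          1         1

  $\alpha$   $\beta$   $\alpha \vee \beta$
  0          0         0
  0          1         1
  1          0         1
  1          1         1

  $\alpha$   $\beta$   $\alpha \to \beta$
  0          0         1
  0          1         1
  1          0         0
  1          1         1
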

The first table should be read as follows: if $\alpha$ is false, then $\neg\alpha$ is true, and if $\alpha$ is true, then $\neg\alpha$ is false. Similarly, the second table means that if $\alpha$ and $\beta$ are both false, then so is $\alpha \wedge \beta$, etc. For example, to see if $\gamma \equiv p_1 \wedge (\neg p_2)$ is true or false given the valuation $p_1 = p_2 = 0$, we first look at the truth table for $\neg$ with $\alpha = p_2$, inferring from the first row that $\neg p_2 = 1$ since $p_2 = 0$. We subsequently inspect the table for $\wedge$ with $\alpha = p_1$ and $\beta = \neg p_2$. Since $p_1 = p_2 = 0$ is the same as $p_1 = 0$ and $\neg p_2 = 1$, we look at the second row, obtaining $\gamma = 0$. Another example, just involving the implication symbol $\to$, is (D.42), given e.g. $p_1 = 1$, $p_2 = 0$, and $p_3 = 1$. This is settled through the following steps, each of which involves the table for $\to$:

1. Taking $\alpha = p_2 = 0$ and $\beta = p_3 = 1$, the second row gives $(p_2 \to p_3) = 1$.

2. Taking $\alpha = p_1 = 1$ and $\beta = (p_2 \to p_3) = 1$, row 4 yields $(p_1 \to (p_2 \to p_3)) = 1$.

3. Similarly, $(p_1 \to p_2) = 0$ and $(p_1 \to p_3) = 1$.

4. From these, the second row gives $((p_1 \to p_2) \to (p_1 \to p_3)) = 1$.

5. Finally, taking $\alpha = (p_1 \to (p_2 \to p_3)) = 1$ and $\beta = ((p_1 \to p_2) \to (p_1 \to p_3)) = 1$ in the fourth row gives

$(p_1 \to (p_2 \to p_3)) \to ((p_1 \to p_2) \to (p_1 \to p_3)) = 1$. (D.45)

The proposition in (D.42) is actually rather special, in that all truth values for the atomic propositions $(p_1, p_2, p_3)$ it contains make it true (as is easily checked).

Definition D.7. A proposition $\varphi$ that is true whatever the (un)truth of the atomic propositions it contains, is called a tautology, denoted by $\models \varphi$.
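The iterative use of truth tables described above is easily mechanized. The sketch below is one possible encoding, not part of the text: propositions are nested tuples such as `("imp", a, b)`, atomic propositions are strings, and `value` and `is_tautology` are hypothetical helper names. It recomputes the two worked examples and confirms that (D.42) is a tautology by running through all eight valuations.

```python
from itertools import product

def atoms(f):
    """Set of atomic propositions occurring in the proposition f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    """Extend a valuation v (dict: atom -> 0 or 1) to f, using the truth tables."""
    if isinstance(f, str):
        return v[f]
    op, args = f[0], [value(g, v) for g in f[1:]]
    if op == "not":
        return 1 - args[0]
    if op == "and":
        return min(args)
    if op == "or":
        return max(args)
    if op == "imp":
        return max(1 - args[0], args[1])
    raise ValueError(op)

def is_tautology(f):
    """True iff f has value 1 under every valuation of its atoms."""
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, bits))) == 1
               for bits in product((0, 1), repeat=len(names)))

def imp(a, b):
    return ("imp", a, b)

# The proposition (D.42) and the example gamma = p1 and (not p2).
D42 = imp(imp("p1", imp("p2", "p3")), imp(imp("p1", "p2"), imp("p1", "p3")))
gamma = ("and", "p1", ("not", "p2"))
```

Since a proposition in $n$ atoms requires $2^n$ rows, this brute-force check is only practical for small examples.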
For example, $\alpha \to \alpha$ is a tautology for any proposition $\alpha$; this follows from the truth table for $\to$ by replacing $\beta$ by $\alpha$, in which case only the first and the fourth rows are consistent (both yielding 1). Introducing a new logical symbol $\leftrightarrow$ by stipulating that $\alpha \leftrightarrow \beta$ is the same as $(\alpha \to \beta) \wedge (\beta \to \alpha)$, one easily proves:

Theorem D.8. The proposition $\alpha \leftrightarrow \beta$ is a tautology iff $\alpha$ and $\beta$ are either both true or both false for each joint truth value of the atomic propositions they contain.

Here $\alpha$ and $\beta$ need not contain the same atomic propositions, but if they do, this theorem says that $\alpha \leftrightarrow \beta$ is a tautology iff $\alpha$ and $\beta$ have the same truth table.

Here and in what follows, one should distinguish theorems about logic from theorems within logic. The former are themselves derived from logical rules that can be formalized, as first done by Hilbert and his school in “meta-mathematics”. The latter is what we now turn to, motivated by the above semantic intermezzo. The syntax of any logical system, such as propositional logic, is completed by stating axioms and deduction rules that enable one to prove theorems. In the case of propositional logic, these are propositions (i.e., expressions correctly formed from rules i) and ii) above) that can be derived from the axioms and deduction rules in a finite number of steps, starting with (some of) the axioms and applying (some of) the deduction rules to the previous steps of the proof. The axioms are considered to be theorems, too. Theorems are often denoted by $\varphi$, and to show that a proposition $\varphi$ is indeed a theorem we write $\vdash \varphi$. Thus the question whether $\vdash \varphi$ holds is purely syntactic, and hence is independent of the truth value of the atomic propositions $p_i$ in $\varphi$.

This is a baby version of the fundamental idea of Boole mentioned above, that the possible meaning of mathematical symbols should not affect the validity of mathematical reasoning about them. Nonetheless, there is a consistency requirement (on the axioms and deduction rules) that one should not be able to derive $\varphi$ if $\varphi$ is semantically false under some truth assignment to the atomic propositions it contains. In other words, a theorem must be true for any truth assignment to the pertinent atomic propositions, or, then again, a theorem within propositional logic must be a tautology, symbolically: $\vdash \varphi$ implies $\models \varphi$ (meta-mathematically). This is the soundness condition on any logical system. Conversely, one would like to prove as many true propositions as possible. Optimally, this is expressed by the completeness condition that $\models \varphi$ imply $\vdash \varphi$. If both hold, i.e., if a system is sound as well as complete, one has $\vdash \varphi$ iff $\models \varphi$: in (other) words, a proposition is a theorem iff it is a tautology.

Achieving this should be the goal of our axioms and deduction rules. This can indeed be done in propositional logic (and also in first-order logic, on a suitable interpretation of $\models$, see §D.4). Even this requirement does not fix the axioms and the deduction rules, although it clearly makes any two such systems equivalent, in the sense that each leads to the same theorems (namely the tautologies). In particular, one can switch between axioms and deduction rules (matters like this were first systematically sorted out by Hilbert and his school, notably Bernays and Ackermann, partly motivated by the Principia Mathematica of Russell and Whitehead).

One particularly convenient choice has just a single deduction rule, namely:

• Modus ponens: if $\vdash \alpha$ and $\vdash \alpha \to \beta$, then $\vdash \beta$.

Even so, the axioms of propositional logic may be stated in many different ways. Although it is even possible to use a single logical symbol (namely the Sheffer stroke $|$, called NAND in computer science, where $\alpha|\beta$ means $\neg(\alpha \wedge \beta)$), we proceed less radically and initially use two symbols. To this end, it is easy to show that

$\alpha \wedge \beta \leftrightarrow \neg(\alpha \to \neg\beta)$ (D.46)

$\alpha \vee \beta \leftrightarrow \neg\alpha \to \beta$ (D.47)

are tautologies, so that in principle the symbols $\vee$ and $\wedge$ are superfluous, in that $\alpha \wedge \beta$ may be regarded as an abbreviation of $\neg(\alpha \to \neg\beta)$, and likewise, $\alpha \vee \beta$ stands for $\neg\alpha \to \beta$. A possible choice for the axioms that regulate $\neg$ and $\to$ is:

$\vdash \beta \to (\alpha \to \beta)$; (D.48)

$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta))$; (D.49)

$\vdash (\neg\alpha \to \neg\beta) \to ((\neg\alpha \to \beta) \to \alpha)$. (D.50)

The third axiom settles the use of $\neg$ and, jointly with modus ponens, justifies proof by contradiction or reductio ad absurdum: suppose one has established

$\vdash \neg\alpha \to \beta$; (D.51)

$\vdash \neg\alpha \to \neg\beta$; (D.52)

then (D.50), (D.52), and modus ponens yield $(\neg\alpha \to \beta) \to \alpha$. Assumption (D.51) and modus ponens then yield $\alpha$. Furthermore, as another proof technique (i.e. a theorem about propositional calculus) one can prove the deduction theorem:

Theorem D.9. If $\alpha$ and $(\gamma_1,\ldots,\gamma_n)$ imply $\beta$, then $(\gamma_1,\ldots,\gamma_n)$ imply $\alpha \to \beta$.

Introducing an external implication symbol $\Rightarrow$, such statements are often written:

$(\alpha,\gamma_1,\ldots,\gamma_n) \vdash \beta \ \Rightarrow\ (\gamma_1,\ldots,\gamma_n) \vdash \alpha \to \beta$. (D.53)

Writing the external “and” as a comma, one can similarly prove the rules

$\beta \to \gamma,\ \gamma \to \delta \ \Rightarrow\ \beta \to \delta$; (D.54)

$\beta \to (\gamma \to \delta),\ \gamma \ \Rightarrow\ \beta \to \delta$. (D.55)

As already mentioned, the central result about propositional logic is:

Theorem D.10. For any proposition $\varphi$, one has $\vdash \varphi$ iff $\models \varphi$.

Proof. We only prove the easy direction. Axioms are tautologies, and modus ponens preserves truth, in that $\models \alpha$ and $\models \alpha \to \beta$ imply $\models \beta$, as follows from the fourth row of the truth table for $\alpha \to \beta$. Hence each step in a proof preserves tautologies. □
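The two ingredients of this soundness argument can be checked by brute force for the axiom system (D.48) - (D.50). In the sketch below (an illustration under the stated encoding, with hypothetical names `AXIOMS`, `always_true`, and `mp_ok`), each axiom scheme is treated as a Boolean function of the truth values of $\alpha, \beta, \gamma, \delta$; since the truth value of a compound proposition depends only on the truth values of its parts, a scheme that yields true on all sixteen rows has only tautologies as instances.

```python
from itertools import product

def imp(a, b):
    """Truth table of ->: false only if a is true and b is false."""
    return (not a) or b

# Axiom schemes as functions of the truth values (alpha, beta, gamma, delta).
AXIOMS = {
    "D.48": lambda a, b, c, d: imp(b, imp(a, b)),
    "D.49": lambda a, b, c, d: imp(imp(b, imp(c, d)), imp(imp(b, c), imp(b, d))),
    "D.50": lambda a, b, c, d: imp(imp(not a, not b), imp(imp(not a, b), a)),
}

ROWS = list(product((False, True), repeat=4))

def always_true(scheme):
    """True iff the scheme evaluates to true on every row."""
    return all(scheme(*row) for row in ROWS)

axioms_ok = {name: always_true(scheme) for name, scheme in AXIOMS.items()}

# Modus ponens preserves truth: in every row where a and a -> b hold, b holds.
mp_ok = all(b for a, b in product((False, True), repeat=2) if a and imp(a, b))
```

The converse direction (completeness) is not touched by such a check, since it concerns the existence of derivations rather than the evaluation of truth tables.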

Nonetheless, the notions of theorem and tautology are quite different conceptually: 
the first is defined syntactically, whereas the latter is defined semantically. 
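
At the other end of the spectrum, we mention an axiom system that involves all four logical connectives (whilst keeping modus ponens as the only deduction rule):

$\vdash (\beta \wedge \gamma) \to \beta$; (D.56)

$\vdash (\beta \wedge \gamma) \to \gamma$; (D.57)

$\vdash \beta \to (\gamma \to (\beta \wedge \gamma))$; (D.58)

$\vdash \beta \to (\beta \vee \gamma)$; (D.59)

$\vdash \gamma \to (\beta \vee \gamma)$; (D.60)

$\vdash (\beta \to \delta) \to ((\gamma \to \delta) \to ((\beta \vee \gamma) \to \delta))$; (D.61)

$\vdash \beta \to (\gamma \to \beta)$; (D.62)

$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta))$; (D.63)

$\vdash \neg\beta \to (\beta \to \gamma)$; (D.64)

$\vdash (\beta \to \gamma) \to ((\beta \to \neg\gamma) \to \neg\beta)$; (D.65)

$\vdash \beta \vee \neg\beta$. (D.66)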

We now describe the relationship between propositional logic and Boolean algebras. Define an equivalence relation $\sim$ on the set $B_\Sigma$ of propositions by

$\varphi \sim \psi$ iff $\psi \vdash \varphi$ and $\varphi \vdash \psi$, (D.67)

where, as in (D.53), the notation $\psi \vdash \varphi$ means that $\varphi$ can be derived from $\psi$; the relation (D.67) holds iff $\vdash \psi \leftrightarrow \varphi$. The ensuing set of equivalence classes

$\mathcal{L}_\Sigma = B_\Sigma/\!\sim$ (D.68)

is called the (classical) Lindenbaum (-Tarski) algebra for the given signature $\Sigma$.

Theorem D.11. The set $\mathcal{L}_\Sigma$ defined by (D.68) is partially ordered by

$[\psi] \leq [\varphi]$ iff $\psi \vdash \varphi$. (D.69)

In this ordering, the ensuing poset is a Boolean algebra, with operations

$[\psi] \vee [\varphi] = [\psi \vee \varphi]$; (D.70)

$[\psi] \wedge [\varphi] = [\psi \wedge \varphi]$; (D.71)

$[\psi]^{c} = [\neg\psi]$. (D.72)

Furthermore, the bottom and top elements of $\mathcal{L}_\Sigma$ are the equivalence classes of any contradiction and any tautology, respectively. The Boolean algebra $\mathcal{L}_\Sigma$ thus obtained is the free Boolean algebra on the set $\Sigma$, and hence any valuation (D.43) - (D.44) induces a homomorphism of Boolean algebras

$V : \mathcal{L}_\Sigma \to \{0,1\}$. (D.73)

Here the free Boolean algebra $\mathcal{L}_\Sigma$ on a set $\Sigma$ is defined as usual, namely as “the” Boolean algebra (unique up to isomorphism), along with an injection $i : \Sigma \to \mathcal{L}_\Sigma$ such that any map $g : \Sigma \to A$, where $A$ is some Boolean algebra, factors through $i$ (i.e., there is a unique homomorphism $f : \mathcal{L}_\Sigma \to A$ such that $g = f \circ i$).
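For a finite signature the Lindenbaum algebra can be enumerated explicitly. By Theorem D.10, two propositions are equivalent in the sense of (D.67) iff they have the same truth table, so an equivalence class may be represented by a column of $2^n$ truth values. The following sketch relies on that identification; the function name `lindenbaum_classes` is an illustrative choice. It closes the columns of the atomic propositions under $\neg$ and $\to$, which suffices because $\wedge$ and $\vee$ are definable via (D.46) - (D.47).

```python
from itertools import product

def lindenbaum_classes(n):
    """Truth-table columns of all propositions in n atomic propositions."""
    rows = list(product((0, 1), repeat=n))
    # The class [p_i] is represented by the column of p_i in the truth table.
    classes = {tuple(row[i] for row in rows) for i in range(n)}
    while True:
        new = set(classes)
        # Classes of negations.
        new |= {tuple(1 - a for a in x) for x in classes}
        # Classes of implications.
        new |= {tuple(max(1 - a, b) for a, b in zip(x, y))
                for x in classes for y in classes}
        if new == classes:
            return classes
        classes = new

two_atoms = lindenbaum_classes(2)
top = (1, 1, 1, 1)       # class of any tautology
bottom = (0, 0, 0, 0)    # class of any contradiction
```

For $n$ atoms one finds $2^{2^n}$ classes, in agreement with the standard count of elements of the free Boolean algebra on $n$ generators.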

Constructions like this become more interesting for propositional theories, in which (beyond specifying the signature $\Sigma$) further axioms are added to whatever system for which Theorem D.10 holds. Let us call the list of such axioms $\mathcal{T}$, where we assume that the theory is consistent, in that no contradiction can be derived from $\mathcal{T}$ (in propositional logic, as opposed to predicate logic, this question is decidable). We also assume that $\mathcal{T}$ contains no tautologies (which would add no new theorems). We now write $\mathcal{T} \vdash \varphi$ if $\varphi$ can be derived (in a finite number of steps) from $\mathcal{T}$ and the basic axioms and deduction rule(s). Unless $\mathcal{T}$ is empty, the set of theorems will be larger now (e.g., any member of $\mathcal{T}$ itself, say $p_1$, is trivially a theorem of $\mathcal{T}$). In order to preserve Theorem D.10, now in the form

$\mathcal{T} \vdash \varphi$ iff $\mathcal{T} \models \varphi$, (D.74)

we should define the right-hand side appropriately. Call a valuation (D.43), or, equivalently, the corresponding homomorphism (D.73), a binary model of $\mathcal{T}$ if

$V(\alpha) = 1$, (D.75)

for each $\alpha \in \mathcal{T} \subset B_\Sigma$ (by soundness this is already the case for the axioms of propositional logic per se). We then say that $\mathcal{T} \models \varphi$ iff $V(\varphi) = 1$ (i.e., $\varphi$ is true) in any binary model of $\mathcal{T}$. On this definition of $\models$, eq. (D.74), and hence Theorem D.10 (with $\mathcal{T}$ added to the axioms), holds. Moreover, for $\alpha, \beta \in B_\Sigma$, define

$\alpha \sim_{\mathcal{T}} \beta$ iff $\mathcal{T} \vdash (\alpha \leftrightarrow \beta)$, (D.76)

where the right-hand side stands for $(\mathcal{T}, \alpha) \vdash \beta$ and $(\mathcal{T}, \beta) \vdash \alpha$. Then define

$\mathcal{L}(\Sigma,\mathcal{T}) = B_\Sigma/\!\sim_{\mathcal{T}}$ (D.77)

and (partially) order $\mathcal{L}(\Sigma,\mathcal{T})$ by $[\psi] \leq [\varphi]$ iff $(\mathcal{T}, \psi) \vdash \varphi$; as before, this is equivalent with $\mathcal{T} \vdash (\psi \to \varphi)$. This construction obviously generalizes (D.67), etc. Then Theorem D.11 holds (mutatis mutandis) for $\mathcal{L}(\Sigma,\mathcal{T})$. In particular, $\mathcal{L}(\Sigma,\mathcal{T})$ is a Boolean algebra, which can also be shown to have the following universal property.

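A model of $\mathcal{T}$ in some Boolean algebra $B$ is a map $V : \Sigma \to B$ whose unique extension $V : B_\Sigma \to B$ makes the axioms of $\mathcal{T}$ true, i.e. $V(\varphi) = \top$ for each $\varphi \in \mathcal{T}$ (where $\top$ is the top element of $B$). Note that $\alpha \mapsto [\alpha]$ is a model of $\mathcal{T}$ in $\mathcal{L}(\Sigma,\mathcal{T})$.

Theorem D.12. For each model $V : \Sigma \to B$ of $\mathcal{T}$, there is a unique homomorphism $V' : \mathcal{L}(\Sigma,\mathcal{T}) \to B$ of Boolean algebras such that $V(\alpha) = V'([\alpha])$ for each $\alpha \in B_\Sigma$.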
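The semantic side of (D.74) - (D.76) is straightforward to compute for a finite signature. In the following sketch, a proposition is represented by a Python function from valuations (dictionaries) to truth values, and the one-axiom theory $\mathcal{T} = \{p_1 \to p_2\}$ is a made-up example rather than one taken from the text; `binary_models`, `entails`, and `equivalent` are hypothetical helper names. The equivalence $\sim_{\mathcal{T}}$ is tested semantically, which is legitimate only because of (D.74).

```python
from itertools import product

def binary_models(signature, theory):
    """All valuations V with V(alpha) = 1 for each axiom alpha, cf. (D.75)."""
    for bits in product((0, 1), repeat=len(signature)):
        v = dict(zip(signature, bits))
        if all(alpha(v) for alpha in theory):
            yield v

def entails(signature, theory, phi):
    """T |= phi: phi is true in every binary model of T."""
    return all(phi(v) for v in binary_models(signature, theory))

def equivalent(signature, theory, alpha, beta):
    """alpha ~_T beta, checked on the binary models of T."""
    return all(bool(alpha(v)) == bool(beta(v))
               for v in binary_models(signature, theory))

SIGMA = ("p1", "p2")
T = [lambda v: (not v["p1"]) or v["p2"]]   # hypothetical theory: p1 -> p2

p1 = lambda v: v["p1"]
p2 = lambda v: v["p2"]
p1_and_p2 = lambda v: v["p1"] and v["p2"]
```

In this example $p_1$ and $p_1 \wedge p_2$ define the same element of $\mathcal{L}(\Sigma,\mathcal{T})$, although they are inequivalent when $\mathcal{T}$ is empty; adding axioms identifies more propositions and hence yields a quotient of the free Boolean algebra.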

D.3 Intuitionistic propositional logic

In view of its importance for quantum mechanics and topos theory, we now briefly discuss the intuitionistic version of the preceding material on (classical) propositional logic. Intuitionism in mathematics originated with the Dutch mathematician L.E.J. Brouwer (1881-1966), who was also one of the most important early contributors to the field of (algebraic) topology. Brouwer held a rather subjective view of mathematics (sometimes even tending towards solipsism), in which mathematics primarily resided within the mind of the “creative subject” (perhaps the right translation of Brouwer’s “scheppend” is: “creating” rather than “creative”). Any means of communication supposedly weakened this effort, so that Brouwer saw the formalization of mathematics (including logic) as secondary and even potentially dangerous; he openly (and polemically) opposed his views to the “formalism” he attributed to Hilbert, with whom he also fell out personally. A more technical consequence of Brouwer’s intuitionism was an emphasis on explicit constructions, rejecting not only proofs by contradiction, but even the abstract existence of mathematical objects in general (as claimed by the so-called Platonic philosophy of mathematics).

Brouwer’s lasting influence on logic is partly due to his student Arend Heyting (1898-1980), who was less radical than his teacher and formalized (!) intuitionistic logic analogously to its classical counterpart. In fact, the system (D.56) - (D.65), with modus ponens, gives axioms for intuitionistic propositional logic, which therefore differs from classical propositional logic exclusively by the absence of the law of the excluded middle (D.66). It is customary in intuitionistic logic to use the purely logical symbols $\wedge$, $\vee$, $\to$, and $\bot$, in terms of which negation is defined by

$\neg\alpha \equiv \alpha \to \bot$. (D.78)

In that case, axiom (D.65) is simply replaced by

$\vdash \bot \to \alpha$, (D.79)

and in the presence of (D.56) - (D.64) with (D.79), the axiom that makes the system classical may now be formulated as the validity of reductio ad absurdum, i.e.,

$\vdash ((\alpha \to \bot) \to \bot) \to \alpha$, (D.80)

which is therefore denied in intuitionistic logic. Similarly, classical rules like:

$\alpha \vee \neg\alpha$; (D.81)

$\neg\neg\alpha \vee \neg\alpha$; (D.82)

(D.83) 

$(\alpha \to \beta) \vee (\beta \to \alpha)$; (D.84)

$\neg(\neg\alpha \wedge \neg\beta) \to (\alpha \vee \beta)$; (D.85)

$\neg(\neg\alpha \vee \neg\beta) \to (\alpha \wedge \beta)$, (D.86)

are invalid in intuitionistic logic, as is, of course, (D.66). Fortunately, as theorems of intuitionistic propositional logic one does have:


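$\vdash \alpha \to \neg\neg\alpha$; (D.87)

$\vdash \neg\neg\neg\alpha \leftrightarrow \neg\alpha$; (D.88)

$\vdash (\alpha \to \beta) \to (\neg\beta \to \neg\alpha)$; (D.89)

$\vdash \neg\alpha \vee \neg\beta \to \neg(\alpha \wedge \beta)$; (D.90)

$\vdash \neg(\alpha \vee \beta) \to (\neg\alpha \wedge \neg\beta)$; (D.91)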

\- (a ^ P) ^ (^j3 -•a). 

(D.92) 


More generally, Gödel’s negative translation of classical (propositional) logic into intuitionistic (propositional) logic establishes the fact that if one takes a classically valid proposition, puts $\neg\neg$ in front of atomic propositions and recursively replaces $\alpha \vee \beta$ by $\neg(\neg\alpha \wedge \neg\beta)$, which changes nothing classically, the ensuing proposition is intuitionistically valid. In this sense, intuitionistic logic is stronger than classical logic, although at first sight it looks weaker (as it has fewer axioms). Also more generally, one often sees that classical results whose proofs apparently rely on intuitionistically invalid reasoning are classically equivalent to intuitionistically valid results (e.g., Gelfand duality).

A natural (and complete) semantics for intuitionistic propositional logic is given by Heyting algebras (replacing the Boolean algebras of the classical case). Let $I_\Sigma$ denote the set of all propositions (i.e., well-formed formulae) on some signature $\Sigma$ built from the letters $p \in \Sigma$ and the symbols $\wedge$, $\vee$, $\to$, and $\bot$, where in formation rule i) preceding (D.42) we also declare $\bot$ to be a proposition, and we omit $\neg\alpha$ at the end of rule ii), as it is a special case of the preceding part with (D.78). If $H$ is a Heyting algebra, we may then extend any function $V : \Sigma \to H$ to a function

$V : I_\Sigma \to H$ (D.93)

by recursively using the following rules, where $\bullet$ is $\wedge$, $\vee$, or $\to$ in $I_\Sigma$ on the left-hand side and the corresponding operation in $H$ on the right-hand side:

$V(\bot) = \bot$; (D.94)

$V(\alpha \bullet \beta) = V(\alpha) \bullet V(\beta)$. (D.95)

Then each axiom $\varphi$ of intuitionistic propositional logic is valid, in that

$V(\varphi) = \top$. (D.96)

Moreover, if $\Gamma$ is some finite set of propositions, then

$\Gamma \vdash \varphi$ implies $V(\bigwedge \Gamma) \leq V(\varphi)$. (D.97)
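A small example may illustrate how the recursion (D.94) - (D.95) separates classical from intuitionistic validity. The sketch below assumes the three-element chain $0 < 1 < 2$ as Heyting algebra $H$ (an example chosen here, not taken from the text), in which $a \to b$ equals the top element if $a \leq b$ and equals $b$ otherwise; the names `val`, `neg`, and `valid` are illustrative.

```python
from itertools import product

TOP, BOT = 2, 0   # the chain 0 < 1 < 2, used as an example Heyting algebra

def h_imp(a, b):
    """Heyting implication on a chain: the largest c with min(c, a) <= b."""
    return TOP if a <= b else b

def val(f, v):
    """Extend V: Sigma -> H to propositions, following (D.94)-(D.95)."""
    if f == "bot":
        return BOT
    if isinstance(f, str):
        return v[f]
    op, a, b = f[0], val(f[1], v), val(f[2], v)
    return {"and": min, "or": max, "imp": h_imp}[op](a, b)

def neg(f):
    """Negation as in (D.78): not f is f -> bot."""
    return ("imp", f, "bot")

def valid(f, names):
    """True iff V(f) is the top element for every valuation of the atoms."""
    return all(val(f, dict(zip(names, vs))) == TOP
               for vs in product(range(3), repeat=len(names)))

excluded_middle = ("or", "a", neg("a"))            # (D.81)
double_neg_intro = ("imp", "a", neg(neg("a")))     # (D.87)
double_neg_elim = ("imp", neg(neg("a")), "a")      # (D.80)
linearity = ("or", ("imp", "a", "b"), ("imp", "b", "a"))   # (D.84)
```

With $V(a) = 1$ one finds $V(a \vee \neg a) = 1 \neq \top$, so by soundness (D.81) is not derivable intuitionistically, and similarly for (D.80). Validity in this single algebra proves nothing in the other direction: (D.84) holds in every chain, so refuting it requires a Heyting algebra that is not linearly ordered.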


In particular, suppose we have a theory $\mathcal{T}$. As in the classical case, we call a valuation (D.93) a model of $\mathcal{T}$ if (D.96) holds for each $\varphi \in \mathcal{T}$. It then follows from (D.97) that each model $V$ of $\mathcal{T}$ is sound in that for all propositions $\varphi$ one has the rule:

$\mathcal{T} \vdash \varphi$ implies $V(\varphi) = \top$. (D.98)

That is, $\varphi$ is true in the given model. As in Theorem D.10, soundness and completeness of Heyting algebra semantics of intuitionistic propositional logic are then jointly expressed by the following result (where $\vdash$ denotes derivability using only the intuitionistically valid axioms (D.56) - (D.64) with (D.79), and modus ponens):

Theorem D.13. For any theory $\mathcal{T}$ in intuitionistic propositional logic, $\mathcal{T} \vdash \varphi$ holds iff $V(\varphi) = \top$ for all Heyting algebra models $V : I_\Sigma \to H$ of $\mathcal{T}$.

The classical construction of the Lindenbaum algebra may also be copied by defining $\mathcal{L}_\Sigma$ and $\mathcal{L}(\Sigma,\mathcal{T})$ as in (D.67) - (D.68), where this time the symbol $\vdash$ defining $\sim$ through (D.67) or (D.76) is the one using the intuitionistically valid axioms only. It follows that any Heyting algebra model $V : I_\Sigma \to H$ factors through a homomorphism $\mathcal{L}_\Sigma \to H$ of Heyting algebras, just as in the classical case (cf. Theorem D.11).

Kripke models are special Heyting algebra models, which already form a complete semantics for intuitionistic propositional logic. For any poset $X$, the set

$\mathrm{Upper}(X) = \mathcal{O}(X)$ (D.99)

of all upper subsets $U$ of $X$ (i.e. $y \geq x \in U$ implies $y \in U$), which by definition coincides with the set $\mathcal{O}(X)$ of open sets in the Alexandrov topology on $X$, is a Heyting algebra in the partial order defined by inclusion, with $\vee = \cup$, $\wedge = \cap$, and

$U \to V = \{x \in X \mid ({\uparrow}x) \cap U \subseteq V\}$. (D.100)

Given a valuation $V : \Sigma \to \mathrm{Upper}(X)$ with associated Heyting algebra homomorphism $V : I_\Sigma \to \mathrm{Upper}(X)$, for any $x \in X$ and $\varphi \in I_\Sigma$ we write $x \Vdash \varphi$ iff $x \in V(\varphi)$, and say that $x$ forces $\varphi$. Then $V(\varphi) = \top$ iff $x \Vdash \varphi$ for all $x \in X$, and we have:

$x \Vdash \varphi$ and $y \geq x$ imply $y \Vdash \varphi$; (D.101)

$x \Vdash \bot$ for no $x \in X$; (D.102)

$x \Vdash \varphi \wedge \psi$ iff $x \Vdash \varphi$ and $x \Vdash \psi$; (D.103)

$x \Vdash \varphi \vee \psi$ iff $x \Vdash \varphi$ or $x \Vdash \psi$; (D.104)

$x \Vdash \varphi \to \psi$ iff for all $y \geq x$: $y \Vdash \varphi$ implies $y \Vdash \psi$; (D.105)

$x \Vdash \neg\varphi$ iff for all $y \geq x$, $y \Vdash \varphi$ is false. (D.106)

Hence these are properties of any homomorphism $V : I_\Sigma \to \mathrm{Upper}(X)$; originally, (D.101) - (D.105), which imply (D.106), were taken to be axioms extending a binary “forcing” relation $x \Vdash p$ on $X \times \Sigma$ to $X \times I_\Sigma$. In topos theory, generalizations of the rules (D.101) - (D.106), once again theorems rather than axioms, will provide the Kripke-Joyal semantics of the (intuitionistic) internal logic of toposes (cf. §E.5).
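The forcing rules lend themselves to a direct recursive implementation. The sketch below assumes the simplest nontrivial example, which is not taken from the text: the poset $X = \{0, 1\}$ with $0 \leq 1$ and a single atomic proposition $p$ with $V(p) = \{1\}$; the names `forces`, `neg`, and `extent` are illustrative.

```python
X = (0, 1)            # example poset with 0 <= 1

def leq(x, y):
    return x <= y

V = {"p": {1}}        # p is forced at the later point only; {1} is an upper set

def forces(x, f):
    """The relation x ||- f, following (D.102)-(D.105)."""
    if f == "bot":
        return False
    if isinstance(f, str):
        return x in V[f]
    op, a, b = f
    if op == "and":
        return forces(x, a) and forces(x, b)
    if op == "or":
        return forces(x, a) or forces(x, b)
    if op == "imp":
        # For all y >= x: if y forces a, then y forces b.
        return all(forces(y, b) for y in X if leq(x, y) and forces(y, a))
    raise ValueError(op)

def neg(f):
    """Negation as in (D.78), so that (D.106) follows from (D.105) and (D.102)."""
    return ("imp", f, "bot")

def extent(f):
    """V(f) as a subset of X."""
    return {x for x in X if forces(x, f)}

lem = ("or", "p", neg("p"))   # the excluded middle for p
```

At the point $0$ neither $p$ nor $\neg p$ is forced (the latter because $1 \geq 0$ forces $p$), so $V(p \vee \neg p) = \{1\} \neq X$, whereas the double negation of $p \vee \neg p$ is forced everywhere.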

D.4 First-order (predicate) logic

Propositional logic lacks the structure to describe arithmetic (not to speak of set theory), because it has neither variables (as we shall see, the symbols $p_i$ are not variables but predicate symbols) nor quantification symbols like ‘there exists’ ($\exists$) and ‘for all’ ($\forall$). This defect is remedied by the formalism of predicate logic, also called first-order logic, which was essentially introduced by Frege and was adopted by Hilbert’s school as a universal language for mathematics (as they knew it), in which for example the Zermelo-Fraenkel (ZF) axioms for set theory may be formulated as a foundation of mathematics (against competitors like the Principia Mathematica system of Russell and Whitehead, and others). A simple mathematical theory that can be formalized using classical first-order logic is Peano Arithmetic (PA).

• The notation of a first-order theory consists of symbols from two groups:

1. The purely logical symbols are the familiar symbols $\neg$, $\wedge$, $\vee$, $\to$ from propositional logic (or some logically independent subset thereof, such as $\neg$ and $\to$), supplemented by the equality sign $=$ and the quantification symbols $\forall$ and $\exists$ (the latter is in fact superfluous in the classical system discussed here, since the combination $\exists x$ defined below is the same as $\neg\forall x\neg$).

2. Unlike the ones above, the non-logical symbols (comprising the signature of the theory) depend on the field of mathematics to be formalized (such as set theory or arithmetic), but the general format is as follows. One has:

a. Variables $a, b, c, \ldots, x, y, z, x_1, x_2, \ldots$, assumed countably many at most. For example, in PA these variables may be thought of as denoting natural numbers, whereas in ZF they will be sets, but of course such interpretations do not form part of the syntax! This warning also applies to the next items. In many-sorted theories the variables are sorted, in that there is a set $\{A, B, \ldots\}$ of sorts, and each variable $x = x_A$ belongs to one of these sorts.

b. Constants, arbitrarily formatted. For example, PA has just one constant, called $0$, to be interpreted as the number zero. Also ZF has just a single (even superfluous) constant $\emptyset$, to be interpreted as the empty set.

c. Function symbols $f, g, \ldots$ Each such symbol has an arity $a(f)$, which is a natural number indicating the number of variables it has (as formalized below). Formally, one allows $a(f) = 0$, in which case $f$ is also a constant. PA has three function symbols, viz. $S$, $+$, and $\times$, with arities $a(S) = 1$, $a(+) = 2$, and $a(\times) = 2$ (these will be interpreted as the successor function $n \mapsto n+1$, addition, and multiplication, respectively). Perhaps surprisingly (especially in the light of category theory), ZF has no function symbols: in set theory, functions $f : X \to Y$ are defined as special subsets of $X \times Y$.

d. Predicate symbols $P, \ldots$, coming with an arity $a(P) \in \mathbb{N}$, too. These will play a role in the construction of formulae, see below (some authors count $=$ as a predicate symbol with arity 2, instead of as a purely logical symbol). PA has no predicate symbols. ZF has one predicate symbol $\in$, with arity 2.

• According to rules we are about to state, from these symbols one subsequently constructs terms, formulae, and sentences (or closed formulae). Sentences are at least candidates for theorems, in that one may attempt to prove them (and may either succeed or fail, the latter even in two possible ways: the sentence may be false, in that its negation can be proved, or it may be undecidable, in that neither it nor its negation can be proved; it was Hilbert’s outspoken intention to exclude the last possibility, which however was famously shown to be unavoidable by Gödel). For example, both $x = 1$ and $\exists x(x = 1)$ are formulae in PA, but only the latter is a sentence, which is even a theorem. The rules, then, are as follows.

1. Term formation is done by iterating the steps:

a. Any variable $x_i$ is a term.

b. Any constant is a term.

c. Any function symbol $f$ and any set of $k = a(f)$ terms $(t_1,\ldots,t_k)$ jointly yield a term $f(t_1,\ldots,t_k)$; if $a(f) = 0$ this reduces to the previous case.

In PA, this means that $S(t)$ is a term, and that $t_1 + t_2 \equiv +(t_1,t_2)$ and $t_1 \times t_2 \equiv \times(t_1,t_2)$ are terms (provided $t$, $t_1$, and $t_2$ are terms). For example, the constant $0$ is a term, and hence $S(0)$ is a term, which one calls $1$. Similarly, $S^n(0)$ is a term called $n$, where e.g. $S^2(0) = S(S(0))$, etc. From these, we can make terms $n + m$, or $n \times x_i$, and subsequently $(n + m) \times (n \times x_i)$, etc.

In ZF, the only terms are $\emptyset$ and the variables (as ZF lacks function symbols).

2. Formulae are (once again iteratively) constructed from terms using the equality sign $=$ and the predicate symbols, according to the following rules:

a. If $t_1$ and $t_2$ are terms, then $t_1 = t_2$ is a formula.

b. Any predicate symbol $P$ and any set of $k = a(P)$ terms $(t_1,\ldots,t_{a(P)})$ jointly yield a formula $P(t_1,\ldots,t_{a(P)})$; if $a(P) = 0$, then $P$ is a formula by itself.

c. As in propositional logic: if $\varphi$ and $\psi$ are formulae, then so are $\neg\varphi$, $\varphi \vee \psi$, $\varphi \wedge \psi$, and $\varphi \to \psi$. What is new to first-order logic is that also $\exists x\varphi$ and $\forall x\varphi$ are formulae, for any variable $x$ (which may or may not occur in $\varphi$).

In PA, the expression $t_1 = t_2$ is a formula (provided $t_1$ and $t_2$ are terms).

In ZF, the expressions $t_1 \in t_2$ and $t_1 = t_2$ are formulae (if $t_1$ and $t_2$ are terms).

3. A variable $x$ occurring in a formula $\varphi$ is called bound if it only occurs via $\forall x\psi(x)$ and/or $\exists x\psi(x)$, where $\psi$ is a subformula of $\varphi$; otherwise, $x$ is free. A formula containing at least one free variable is called open; if $x$ occurs freely in an open formula $\varphi$, the latter is sometimes called $\varphi(x)$, and analogously $\varphi(x_1,\ldots,x_n)$. If all variables in a formula are bound (or if it contains no variables at all), then it is said to be closed. A sentence is a closed formula.
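The formation rules above are purely syntactic and can be mirrored by a small data structure. The following sketch assumes an encoding of terms and formulae of PA as nested tuples, e.g. `("fn", "S", t)` for $S(t)$ and `("exists", "x", f)` for $\exists x\, f$; this encoding and the names `numeral`, `term_vars`, `free_vars`, and `is_sentence` are illustrative choices. It builds the numerals $S^n(0)$ and computes the free variables of a formula, so that sentences can be recognized.

```python
def numeral(n):
    """The term S^n(0), which the text abbreviates as n."""
    t = ("const", "0")
    for _ in range(n):
        t = ("fn", "S", t)
    return t

def term_vars(t):
    """Variables occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    # ("fn", name, arg_1, ..., arg_k)
    return set().union(*(term_vars(s) for s in t[2:]))

def free_vars(f):
    """Variables with at least one free occurrence in the formula f."""
    kind = f[0]
    if kind == "eq":
        return term_vars(f[1]) | term_vars(f[2])
    if kind == "not":
        return free_vars(f[1])
    if kind in ("and", "or", "imp"):
        return free_vars(f[1]) | free_vars(f[2])
    if kind in ("forall", "exists"):
        return free_vars(f[2]) - {f[1]}
    raise ValueError(kind)

def is_sentence(f):
    """A sentence is a closed formula, i.e. one without free variables."""
    return not free_vars(f)

x = ("var", "x")
open_formula = ("eq", x, numeral(1))          # x = 1
sentence = ("exists", "x", open_formula)      # exists x (x = 1)
```

As in the example given earlier, $x = 1$ is an open formula, whereas $\exists x(x = 1)$ is a sentence.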

• Axioms are, syntactically speaking, special cases of formulae. As for propositional logic, we may either keep $\wedge$ and $\vee$, adding (D.46) - (D.47) as axioms, or, equivalently, we may see $\alpha \wedge \beta$ and $\alpha \vee \beta$ as abbreviations for $\neg(\alpha \to \neg\beta)$ and $\neg\alpha \to \beta$, respectively. Similarly, we may see $\exists$ as a derived symbol, in that $\exists x\varphi$ is an abbreviation for $\neg\forall x\neg\varphi$.

As in propositional logic, the axioms for predicate logic come in two groups: purely logical axioms and domain specific axioms. We will state the latter for the theories PA and ZF in §D.5 below, and now discuss the former (common to both).

From propositional logic, we adopt (D.48) - (D.50), where $\alpha, \beta, \gamma, \delta$ are arbitrary formulae. These are also Axioms 1-3 of predicate logic, to which one adds:

Axiom 4: $\vdash \forall x\varphi(x) \to \varphi(t)$ for any term $t$ (unless $x$ occurs freely in $\varphi$ through a subformula $\forall y\psi$ where $y$ occurs in $t$); some authors write $\varphi(x/t)$ for $\varphi(t)$.

Axiom 5 : h {\/x{(P W)) ^ '^xW)- 

Axiom 6: $\vdash \forall x(x = x)$.

Axiom 7: $\vdash \forall x\forall y((x = y) \to (\varphi(x) \to \varphi(y)))$ for each formula $\varphi$ that contains the variable $x$ freely and contains $y$ either freely or not at all.

• The only two deduction rules of predicate logic (for formulae $\varphi$, $\psi$) are:

1. Modus ponens: $\vdash (\varphi \to \psi)$ and $\vdash \varphi$ imply $\vdash \psi$.

2. Universal generalization: $\vdash \varphi(x)$ implies $\vdash \forall x\varphi(x)$.

These rules also apply to theories $\mathcal{T}$, with the proviso that in the second, $\mathcal{T} \vdash \varphi(x)$ implies $\mathcal{T} \vdash \forall x\varphi(x)$ only if no formula in $\mathcal{T}$ used in the proof of $\varphi$ freely contains $x$.

• A theorem is a sentence $\varphi$ that can be proved from the axioms using the deduction rules in a finite number of steps. In that case, we write $\vdash \varphi$.

• A theory is a set of formulae (assumed contradiction-free, although for e.g. ZF this cannot be proved within ZF because of Gödel’s Incompleteness Theorem).

• An interpretation of a theory $\mathcal{T}$ consists of a nonempty set $M$ (the carrier of the interpretation), elements $[[c]]_M \in M$ for each constant $c$, functions $[[f]]_M : M^{a(f)} \to M$ for each function symbol $f$ of arity $a(f)$, and subsets $[[P]]_M \subset M^{a(P)}$ for each predicate symbol $P$ of arity $a(P)$. The interpretation of a formula $\varphi$ then follows by giving the logical symbols $\neg$, $\wedge$, $\vee$, $\to$, and $=$ their usual meanings of “not”, “and”, “or”, “implies”, and “is equal to”, whereas the range of each variable $x$ occurring in $\forall x$ or $\exists x$ is taken to be $M$. If a sentence $\varphi$ is true in this interpretation, we write $M \models \varphi$. If each axiom of $\mathcal{T}$ is true, then we call the given interpretation a model of $\mathcal{T}$ (so that in a model, $\mathcal{T} \vdash \varphi$ implies $M \models \varphi$).
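When the carrier $M$ is finite, truth in an interpretation can be computed by exhausting the range of each quantified variable. The sketch below is an illustration under assumptions not made in the text: it interprets the signature of PA in the three-element carrier $\{0,1,2\}$ with arithmetic modulo 3, uses a tuple encoding of terms and formulae such as `("fn", "S", t)` and `("forall", "x", f)`, and introduces the hypothetical names `ev` and `holds`.

```python
M = range(3)                                  # example finite carrier {0, 1, 2}
CONST = {"0": 0}                              # interpretation of the constant 0
FUNC = {"S": lambda a: (a + 1) % 3,           # interpretation of S
        "+": lambda a, b: (a + b) % 3,        # interpretation of +
        "x": lambda a, b: (a * b) % 3}        # interpretation of the product

def ev(t, env):
    """Value in M of a term, given an assignment env of its variables."""
    if t[0] == "var":
        return env[t[1]]
    if t[0] == "const":
        return CONST[t[1]]
    return FUNC[t[1]](*(ev(s, env) for s in t[2:]))

def holds(f, env=None):
    """Truth of a formula in the interpretation; quantifiers range over M."""
    env = env or {}
    kind = f[0]
    if kind == "eq":
        return ev(f[1], env) == ev(f[2], env)
    if kind == "not":
        return not holds(f[1], env)
    if kind == "imp":
        return (not holds(f[1], env)) or holds(f[2], env)
    if kind == "forall":
        return all(holds(f[2], {**env, f[1]: m}) for m in M)
    if kind == "exists":
        return any(holds(f[2], {**env, f[1]: m}) for m in M)
    raise ValueError(kind)

zero, x = ("const", "0"), ("var", "x")
one = ("fn", "S", zero)
exists_one = ("exists", "x", ("eq", x, one))
zero_no_successor = ("forall", "x", ("not", ("eq", ("fn", "S", x), zero)))
```

Here $\exists x(x = 1)$ is true, but $\forall x\, \neg(S(x) = 0)$ is false because $S(2) = 0$ modulo 3. Since the latter sentence is among the usual axioms of PA, this interpretation of the signature is not a model of PA, which shows that being a model is a genuine restriction on interpretations.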

Gödel’s Completeness Theorem (to be contrasted with his incompleteness theorem,
which roughly states that any first-order theory that incorporates PA contains
undecidable sentences) generalizes Theorem D.10 and eq. (D.74) to first-order logic:

Theorem D.14. A first-order theory $T$ is consistent iff it has a model. In that case,
a sentence $\varphi$ of $T$ is a theorem iff it is true in all models of $T$.

Propositional logic is a special case of predicate logic, namely by assuming
no variables, constants, and function symbols, and taking the atomic propositions
$(p_1, \ldots)$ to be predicate symbols with arity zero (or else $\{0, 1\}$-valued variables).
The rules of term formation in predicate logic then show that propositional logic has
no terms, so that step 2.a above is empty, and step 2.b only yields the $p_i$. These may
be turned into compound expressions by the original rules of propositional logic,
which in this case coincide with the rules of predicate logic (since there are no
variables, $\exists x\, \varphi$ and $\forall x\, \varphi$ are both equivalent to $\varphi$). Finally, formulae coincide
with sentences, since in the absence of variables, all formulae are closed.
As a transition to the next appendix, we continue our discussion on intuitionistic
logic started in §D.3. The propositional fragment of first-order intuitionistic logic is
still given by (D.56) - (D.65), in which the connectives $\wedge$, $\vee$, $\to$, and $\neg$ (or $\bot$) are
independent. The equality sign $=$ is treated with suspicion in intuitionism, and hence
is omitted, whilst $\exists$ can no longer be defined in terms of $\forall$ through the classical
identification of $\exists x$ with $\neg \forall x\, \neg$. Instead, it is regulated by the two axioms

\-{yx{(p ^ w)) ^ (D.107) 

$\vdash \varphi(t) \to \exists x\, \varphi(x)$, (D.108)

subject to the same proviso as Axiom 4 of the classical case, plus a deduction rule:

• $\exists$-elimination: $\vdash \exists x\, \varphi$ implies $\vdash \varphi$ (provided $x$ is not free in $\varphi$).

This will be the logic on which the topos theory of the next chapter is based. Scary
examples of intuitionistically invalid rules involving $\forall$ and $\exists$ include:


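$\neg \forall x\, \neg \varphi(x) \leftrightarrow \exists x\, \varphi(x)$; (D.109)

$\forall x\, \neg\neg \varphi(x) \to \forall x\, \varphi(x)$; (D.110)

$\neg\neg \exists x\, \varphi(x) \leftrightarrow \exists x\, \neg\neg \varphi(x)$; (D.111)

$(\varphi \to \exists x\, \psi(x)) \to \exists x\, (\varphi \to \psi(x))$, (D.112)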

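whereas useful intuitionistically valid theorems containing $\forall$ and $\exists$ are, e.g.,

$\neg \exists x\, \varphi(x) \to \forall x\, \neg \varphi(x)$; (D.113)

$\neg\neg \forall x\, \varphi(x) \to \forall x\, \neg\neg \varphi(x)$; (D.114)

$\neg\neg \exists x\, \varphi(x) \leftrightarrow \neg \forall x\, \neg \varphi(x)$. (D.115)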


Gödel’s negative translation of classical logic to intuitionistic logic extends to
first-order logic: if, further to the manipulations mentioned after (D.92), one also replaces
$\exists x\, \varphi(x)$ by $\neg \forall x\, \neg \varphi(x)$, then theorems $\varphi$ of classical first-order logic are turned into
theorems of intuitionistic first-order logic. Although we will not use it, we mention
that the notion of a Kripke model also extends from propositional to predicate
intuitionistic logic: compared to a classical model carried by a set $M$, as described
above, we now have a family of (classical!) sets $(M_x)_{x \in P}$ indexed by some poset $P$,
in which constants, functions, and predicate symbols are similarly interpreted as
families $([\![c]\!]_{M_x} \in M_x)$, $([\![f]\!]_{M_x} : M_x^{a(f)} \to M_x)$, and $([\![P]\!]_{M_x} \subset M_x^{a(P)})$, such that if
$x \leq y$, then $M_x \subset M_y$, $[\![c]\!]_{M_x} = [\![c]\!]_{M_y}$, $G([\![f]\!]_{M_x}) \subset G([\![f]\!]_{M_y})$ (where $G(f)$ is the
graph of $f$), and $[\![P]\!]_{M_x} \subset [\![P]\!]_{M_y}$. Further to the forcing rules (D.101) - (D.106) for
intuitionistic propositional logic, there are additional ones for $\exists$ and $\forall$, viz.

$x \Vdash \exists x\, \varphi(x)$ if there exists $m \in M_x$ such that $x \Vdash \varphi(m)$; (D.116)

$x \Vdash \forall x\, \varphi(x)$ if for all $y \geq x$ and all $m \in M_y$ one has $y \Vdash \varphi(m)$. (D.117)

We will revisit these rules in topos theory, see §E.5; indeed, Kripke models for
intuitionistic predicate logic emerge much more naturally in categorical language.
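The two quantifier clauses (D.116) and (D.117) can be tried out on a tiny Kripke model. The sketch below is an illustration under stated assumptions: the clauses used for negation and implication are the standard Kripke clauses (they are assumed here to agree with (D.101) - (D.106), which are not reproduced in this passage), and the two-world model, the predicate name `P`, and all Python names are hypothetical. The model refutes the left-to-right direction of (D.109) at the bottom world while forcing the converse.

```python
# A two-world Kripke model: 0 <= 1, with growing domains and a growing predicate.
WORLDS = [0, 1]
LEQ = {(0, 0), (0, 1), (1, 1)}               # the partial order on worlds
DOMAIN = {0: {"a"}, 1: {"a", "b"}}           # M_x, with M_0 a subset of M_1
P_EXT = {0: set(), 1: {"b"}}                 # extension of P at each world

def above(x):
    """All worlds y with x <= y."""
    return [y for y in WORLDS if (x, y) in LEQ]

def forces(x, formula, env=None):
    """Kripke forcing x ||- formula; env assigns domain elements to variables."""
    env = {} if env is None else env
    kind = formula[0]
    if kind == "P":                          # atomic formula P(v)
        return env[formula[1]] in P_EXT[x]
    if kind == "not":                        # assumed standard clause
        return all(not forces(y, formula[1], env) for y in above(x))
    if kind == "implies":                    # assumed standard clause
        return all((not forces(y, formula[1], env)) or forces(y, formula[2], env)
                   for y in above(x))
    if kind == "exists":                     # clause (D.116)
        _, var, body = formula
        return any(forces(x, body, {**env, var: m}) for m in DOMAIN[x])
    if kind == "forall":                     # clause (D.117)
        _, var, body = formula
        return all(forces(y, body, {**env, var: m})
                   for y in above(x) for m in DOMAIN[y])
    raise ValueError(f"unknown formula: {formula!r}")

EXISTS_P = ("exists", "v", ("P", "v"))
NOT_FORALL_NOT_P = ("not", ("forall", "v", ("not", ("P", "v"))))
```

At world 0 the formula `NOT_FORALL_NOT_P` is forced (because the element `b` satisfying `P` appears at world 1), but `EXISTS_P` is not, since no element of $M_0$ satisfies `P` there. Hence `("implies", NOT_FORALL_NOT_P, EXISTS_P)` is not forced at world 0, whereas the converse implication is.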
D.5 Arithmetic and set theory

Completing our running examples (for classical first-order logic), we now give the 
theories PA and ZF, starting with the axioms of Peano Arithmetic: 

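PA1 $\vdash \forall x\, (\neg (S(x) = 0))$;

PA2 $\vdash \forall x \forall y\, (S(x) = S(y) \to x = y)$;

PA3 $\vdash \forall x\, (x + 0 = x)$;

PA4 $\vdash \forall x \forall y\, (x + S(y) = S(x + y))$;

PA5 $\vdash \forall x\, (x \times 0 = 0)$;

PA6 $\vdash \forall x \forall y\, (x \times S(y) = (x \times y) + x)$;

PA7 $\vdash (\varphi(0) \wedge (\forall x\, (\varphi(x) \to \varphi(S(x))))) \to \forall x\, \varphi(x)$, for any formula $\varphi(x)$.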

Thinking of the variables in question as natural numbers (which is what Peano
himself still did), these axioms obviously capture their properties pretty well and may
require no further explanation (except perhaps the last one, which enables the proof
technique of induction). The point, however, is that the axioms only form a syntax;
the natural numbers $\mathbb{N}$ (as a set) themselves form a model of PA in the general sense
discussed in the previous section (though by no means the only possible model, and
hence $\mathbb{N}$ is called the standard model of PA). In particular, this means that the set
$\mathbb{N}$ is assumed to be known (e.g. via ZF, see below), upon which the interpretation
of some formula $\varphi$ in PA is determined by the rules given earlier. In particular:

• The constant 0 is interpreted as the number zero.

• The function (symbol) $S$ is interpreted as $S(x) = x + 1$, whilst the functions $+$
and $\times$ are interpreted as addition and multiplication, respectively.

• The range of all variables is taken to be $\mathbb{N}$, i.e., $\forall x$ means “for all $x \in \mathbb{N}$”, and $\exists x$
means “there exists $x \in \mathbb{N}$”.

According to the general definition, a sentence $\varphi$ of PA is then called true in the
given model (i.e., in the natural numbers) if $[\![\varphi]\!]_{\mathbb{N}}$ is true, in which case we write
$\mathbb{N} \models \varphi$. For example, $[\![\forall x \forall y\, (x + y = y + x)]\!]_{\mathbb{N}}$ means that for all natural numbers
$x, y \in \mathbb{N}$, one has $x + y = y + x$ (which is true, isn’t it). Another example is $1 + 1 = 2$, which
abbreviates $S(0) + S(0) = S(S(0))$. The interpretation $[\![1 + 1 = 2]\!]_{\mathbb{N}}$ is given by
$1 + 1 = 2$ (which once again is true!). In particular, the above axioms of PA are true
in this interpretation. The key conceptual point here is that (following Hilbert) one
interprets a theory in a domain that is supposed to be known and consistent, so that
it has its own methods of proof (for otherwise the semantic entailment symbol $\models$
would be undefined). In this particular case, the domain is ZF set theory (or at least
its lower echelons); see the comments to axiom ZF7 below.

It is quite instructive to see the crucial role of the seemingly technical axiom PA7.
Suppose we try to define a model of PA in the set $\mathbb{Q}^+$ of positive rational numbers
(including zero), so that $\forall x$ means “for all $x \in \mathbb{Q}^+$”, and $\exists x$ stands for “there exists
$x \in \mathbb{Q}^+$”; the number zero (as the interpretation of the constant 0) and the functions
$S$, $+$, and $\times$ have their usual meaning, however. Then all of PA1-PA6 hold, but PA7
fails, and hence the given interpretation of PA in $\mathbb{Q}^+$ is not a model of PA.
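One way to see the failure of PA7 concretely is to pick the formula $\varphi(x)$ given by $x = 0 \vee \exists y\, (x = S(y))$, which is an illustrative choice and not one made in the text. In $\mathbb{Q}^+$ it holds at 0 and is preserved by $S$, yet it fails at $1/2$. The sketch below spot-checks PA1-PA6 on a finite sample of nonnegative rationals (a sanity check on sample points only, not a proof) and evaluates this instance of induction; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def S(x):
    """Successor, interpreted as x + 1 on the nonnegative rationals."""
    return x + 1

# A finite sample of nonnegative rationals used for spot checks only.
SAMPLE = [Fraction(n, d) for n in range(0, 7) for d in range(1, 4)]

def pa1_to_pa6_hold_on(sample):
    """Spot-check PA1-PA6 on all pairs from the sample."""
    for x, y in product(sample, repeat=2):
        if S(x) == 0:                         # PA1
            return False
        if S(x) == S(y) and x != y:           # PA2
            return False
        if x + 0 != x:                        # PA3
            return False
        if x + S(y) != S(x + y):              # PA4
            return False
        if x * 0 != 0:                        # PA5
            return False
        if x * S(y) != (x * y) + x:           # PA6
            return False
    return True

def phi(x):
    """phi(x): x = 0, or x = S(y) for some y in Q+ (i.e. x - 1 is nonnegative)."""
    return x == 0 or x - 1 >= 0

base_case = phi(Fraction(0))
# S(x) - 1 = x is nonnegative for every x in Q+, so the inductive step always holds;
# the line below merely confirms this on the sample.
inductive_step = all((not phi(x)) or phi(S(x)) for x in SAMPLE)
conclusion_fails_at_half = not phi(Fraction(1, 2))
```

Both hypotheses of this instance of PA7 hold, but its conclusion $\forall x\, \varphi(x)$ is false at $x = 1/2$, which is neither zero nor the successor of a nonnegative rational.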
The axioms of ZF are a trifle more complex than those of PA, but then they are
supposed to describe all of mathematics! We use the following abbreviations:



III 

(D.118) 


$\alpha \leftrightarrow \beta \equiv (\alpha \to \beta) \wedge (\beta \to \alpha)$; (D.119)

$x \neq y \equiv \neg (x = y)$; (D.120)

$x \notin y \equiv \neg (x \in y)$. (D.121)

Other notation of ZF will be explained in the text following the axioms, which are: 

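ZF1 $\vdash \forall_{x,y}\, ((\forall_z\, (z \in x \leftrightarrow z \in y)) \leftrightarrow x = y)$ (Extensionality)

ZF2 $\vdash \forall_x \exists_y \forall_z\, (((z \in x) \wedge \varphi(z)) \leftrightarrow z \in y)$ (Separation)

ZF3 $\vdash \neg \exists_x\, x \in \emptyset$ (Empty set)

ZF4 $\vdash \forall_{v,w} \exists_y \forall_z\, (z \in y \leftrightarrow (z = v) \vee (z = w))$ (Pairing)

ZF5 $\vdash \forall_x \exists_y \forall_z\, (z \in y \leftrightarrow \exists_{w \in x}\, z \in w)$ (Union)

ZF6 $\vdash \forall_x \exists_y \forall_z\, (z \in y \leftrightarrow z \subset x)$ (Power set)

ZF7 $\vdash \exists_x\, (\emptyset \in x \wedge \forall_y\, (y \in x \to y^+ \in x))$ (Infinity)

ZF8 $\vdash \forall_u\, ((\forall_{x \in u} \exists!_z\, \varphi(x, z)) \to \exists_y \forall_z\, (z \in y \leftrightarrow \exists_{x \in u}\, \varphi(x, z)))$ (Replacement)

ZF9 $\vdash \forall_{v \neq \emptyset} \exists_{x \in v} \forall_y\, (y \in x \to y \notin v)$ (Regularity)

AC $\vdash \forall_u \exists_w\, ((w \subset \mathcal{P}(u) \times u) \wedge (\forall_{x \in \mathcal{P}(u)}\, (x \neq \emptyset \to \exists!_{y \in x}\, \langle x, y \rangle \in w)))$ (Choice)

In ZF2 and ZF8, $\varphi(\cdot)$ is an arbitrary formula with at least the specified free
variables, so that these axioms are more properly thought of as axiom schemes.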

These axioms have been the subject of entire monographs, but we will be brief
here. All intuition about the axioms comes from “naive” sets, although the whole
point should be that the axioms stand on their own, and circumvent the problem
of defining sets conceptually (as Frege and Cantor desperately tried to do, much
as Euclid tried to define in vain what a point is, before he was liberated by
Hilbert). The axioms may be put into two groups: Axioms ZF1, ZF3, ZF9, and
AC are concerned with given sets, whereas nos. ZF2, ZF4, ZF5, ZF6, ZF7, and
ZF8 regulate the way new sets may be constructed from old ones. Here are some
comments on the axioms one by one (which should, however, be seen as a whole).

ZF1 states that a set is determined by its members (which themselves are sets!).

ZF2 is a correct version of the naive idea of Cantor, Dedekind, and Frege that
every property (or predicate) defines a set. If we look at a predicate as a formula
$\varphi(z)$ stating that $z$ has a certain property, the naive idea of these gentlemen was
that $y = \{z \mid \varphi(z)\}$ is a set. This idea would be secured by the axiom

$\exists_y \forall_z\, (\varphi(z) \leftrightarrow z \in y)$, (D.122)

which however leads to Russell’s Paradox (in which $\varphi(z) = z \notin z$).
The crucial difference between ZF2 and this naive version is that in ZF one
restricts set formation to those $z$ that satisfy $\varphi(z)$ and are a member of some set $x$
that is already given. By ZF1, the set $y$ defined by ZF2 is unique; it is written as

$y = \{z \in x \mid \varphi(z)\}$. (D.123)

This notation introduces the familiar brackets $\{\cdots\}$ from naive set theory, which
are therefore derived concepts not belonging to the notation of ZF. This is also
true for most of the other symbols from naive set theory (except $\in$, which is a
predicate symbol in ZF). For example, for arbitrary “sets” $x$ and $v$ (which so far
are really just variables in ZF), we introduce $x \cap v$ as a name (i.e., an abbreviation)
for the set $y$ defined by taking $\varphi(z)$ in ZF2 to be $z \in v$. Using the notation (D.123),
this defines the symbol $\cap$ (for “intersection”) by

$x \cap v = \{z \in x \mid z \in v\}$. (D.124)

ZF3 states that $\emptyset$, which was the only constant in ZF, has no elements. According
to ZF1 this set is unique, so that $\emptyset$ may be thought of as the empty set. In
particular, ZF3 implies that there are sets in the first place (instead of defining it as a
constant, one could alternatively introduce the symbol $\emptyset$ at this stage).

An equivalent form of ZF3 is: $\vdash \forall_x \neg (x \in \emptyset)$, also written as $\forall_x\, x \notin \emptyset$.

ZF4 states that for given sets $v$ and $w$, there exists a set $y$ with exactly those two
members. We write this $y$ as $y = \{v, w\}$, which uses brackets $\{\cdots\}$ consistently:
in ZF2 take $\varphi(z)$ to be $(z = v) \vee (z = w)$ and take $x$ to be the $y$ just considered.
This may be iterated, so that we may write $\{x_1, \ldots, x_n\}$ for the set $y$ that satisfies

$\forall_{x_1, \ldots, x_n} \exists_y \forall_z\, (z \in y \leftrightarrow (z = x_1) \vee \cdots \vee (z = x_n))$; (D.125)

this set is unique by ZF1. Using the notation from ZF2 we may then write

$\{x_1, \ldots, x_n\} = \{z \in y \mid (z = x_1) \vee \cdots \vee (z = x_n)\}$. (D.126)

ZF5 postulates the existence of a set $y$ whose elements are the elements of the
elements of $x$. In this axiom, the generic notation

$\exists_{w \in x}\, \psi \equiv \exists_w\, ((w \in x) \wedge \psi)$, (D.127)

is used, where $\psi$ is some formula, which in ZF5 is $z \in w$. We write $y = \bigcup x$, which
defines the symbol $\bigcup$, i.e.,

$\bigcup x = \{z \in y \mid \exists_{w \in x}\, z \in w\}$, (D.128)

where $y = \bigcup x$ is the set whose existence is guaranteed by ZF5. In the special
case $x = \{x_1, \ldots, x_n\}$, we write

$x_1 \cup \cdots \cup x_n = \bigcup \{x_1, \ldots, x_n\}$. (D.129)
ZF6 calls for each $x$ to have a power set $y$. The notation

$z \subset x \equiv \forall_y\, (y \in z \to y \in x)$, (D.130)

defines the symbol $\subset$; note that $z = x$ is allowed, so that $\subset = \subseteq$. As usual, the set
$y$ is unique due to ZF1, and is denoted by $\mathcal{P}(x)$, whose elements are therefore
the subsets $z$ of $x$. We may write this à la (D.123) as ($y$ being the set from ZF6):

$\mathcal{P}(x) = \{z \in y \mid z \subset x\}$. (D.131)

ZF7 postulates the existence of a set $y$ whose elements include

$\emptyset$, $\emptyset^+ = \{\emptyset\}$, $\{\emptyset\}^+ = \{\emptyset, \{\emptyset\}\}$, $\{\emptyset, \{\emptyset\}\}^+ = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, ... (D.132)

in which the notation

$y^+ = \bigcup \{y, \{y\}\} = y \cup \{y\}$, (D.133)

is underwritten by ZF5. Hence the elements of $y^+$ are the elements of $y$,
supplemented with the single element $y$. Following von Neumann, the sets in (D.132)
are called $\underline{0}, \underline{1}, \underline{2}, \underline{3}, \ldots$, respectively, where $\underline{0}$ is identified with the empty set, and
$\underline{n} > 0$ is realized in a very specific way. Thus ZF7 states the existence of a set
containing $\underline{0}, \underline{1}, \underline{2}, \underline{3}, \ldots$. The intersection of all sets with this property is the smallest
set containing $\underline{0}, \underline{1}, \underline{2}, \underline{3}, \ldots$; this is the smallest infinite set, called $\omega$. In the
standard model of ZF (see below), $\omega$ is (a copy of) the set $\mathbb{N}$ of natural numbers.

ZF8, in which $\varphi$ should not contain $y$, states that if some formula $\varphi(x, z)$ assigns
exactly one $z$ to any given $x$, then these $z$ form a set, provided the variables $x$ form
a set (i.e., $u$). Such a formula $\varphi$ is really a function $f$ so that $f(x) = z$, and hence
this axiom states that the image of any set under some function is again a set.
Using the notation (D.123), we then have

$f(u) = \{z \in y \mid \exists_{x \in u}\, \varphi(x, z)\}$. (D.134)

ZF9 is the most contrived axiom in ZF, stating that every nonempty set $v$ contains
some element $x$ disjoint from $v$. Its formulation uses the generic abbreviation

$\forall_{v \neq \emptyset}\, \psi \equiv \forall_v\, ((\exists_z\, z \in v) \to \psi)$. (D.135)

Using the symbol $\cap$ from (D.124), one easily checks that $\forall_y\, (y \in x \to y \notin v)$ is
the same as $x \cap v = \emptyset$, in terms of which ZF9 reads

$\vdash \forall_{v \neq \emptyset} \exists_{x \in v}\, (x \cap v = \emptyset)$. (D.136)

This implies $x \notin x$, which avoids all kinds of paradoxes (though not Russell’s,
which was taken care of by ZF2). Moreover, ZF9 enables transfinite induction.

AC warrants the choice of an element of each nonempty subset of any set. Indeed,
rewriting the expression $\exists!_{y \in x}\, \langle x, y \rangle \in w$ as $\exists!_{y \in u}\, (\langle x, y \rangle \in w \wedge y \in x)$, AC reads
$\vdash \forall_u \exists_w\, ((w \subset \mathcal{P}(u) \times u) \wedge (\forall_{x \in \mathcal{P}(u)}\, (x \neq \emptyset \to \exists!_{y \in u}\, (\langle x, y \rangle \in w \wedge y \in x))))$. (D.137)

As we shall see shortly, this shows that there exists a function

$f : \mathcal{P}(u) \to u$ (D.138)

that maps $x \in \mathcal{P}(u)$ to $f(x) \in u$, such that $\forall_{x \in \mathcal{P}(u)}\, (x \neq \emptyset \to f(x) \in x)$. Although
$\exists_f$ is undefined in (first-order) ZF, one may therefore informally rewrite AC as

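$\forall_u \exists_{f : \mathcal{P}(u) \to u} \forall_{x \in \mathcal{P}(u)}\, (x \neq \emptyset \to f(x) \in x)$. (D.139)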
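For finite sets, the constructions regulated by ZF2, ZF4, ZF5, ZF6 and the successor operation (D.133) can be imitated directly. The following is a minimal sketch that models hereditarily finite sets by Python `frozenset` objects; this modelling choice and all function names are assumptions made for illustration, and nothing here touches the infinite sets that ZF7 is really about.

```python
from itertools import combinations

EMPTY = frozenset()                       # plays the role of the empty set

def pair(v, w):
    """ZF4: the set {v, w}."""
    return frozenset({v, w})

def union(x):
    """ZF5: the set of all elements of elements of x."""
    return frozenset(z for w in x for z in w)

def power_set(x):
    """ZF6: the set of all subsets of x."""
    elems = list(x)
    return frozenset(frozenset(c)
                     for r in range(len(elems) + 1)
                     for c in combinations(elems, r))

def separate(x, phi):
    """ZF2 in the form (D.123): {z in x | phi(z)}."""
    return frozenset(z for z in x if phi(z))

def intersection(x, v):
    """(D.124): x intersected with v, defined through separation."""
    return separate(x, lambda z: z in v)

def successor(y):
    """(D.133): y+ = U{y, {y}} = y u {y}."""
    return union(pair(y, frozenset({y})))

def numeral(n):
    """The von Neumann numeral obtained by applying successor n times to EMPTY."""
    y = EMPTY
    for _ in range(n):
        y = successor(y)
    return y

def has_disjoint_element(v):
    """The content of ZF9 for a given nonempty set v."""
    return any(intersection(x, v) == EMPTY for x in v)
```

In this encoding `numeral(3)` has exactly the three elements `numeral(0)`, `numeral(1)` and `numeral(2)`, and `has_disjoint_element(numeral(3))` holds because the empty set is an element of it.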

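We now formally define functions, which, as already noted, are curiously absent
in ZF (which lacks function symbols). This relies on the following theorem of ZF:

$\vdash \forall_{u,v} \forall_{x,y}\, ((x \in u) \wedge (y \in v)) \to \{\{x\}, \{x, y\}\} \in \mathcal{P}(\mathcal{P}(u \cup v))$. (D.140)

We now introduce the abbreviation

$\langle x, y \rangle = \{\{x\}, \{x, y\}\}$, (D.141)

which by (D.140) is an element of the double power set $\mathcal{P}(\mathcal{P}(u \cup v))$ (assuming
that $x \in u$ and $y \in v$); this notation makes $\langle x, y \rangle$ an ordered pair, as opposed to
$\{x, y\} = \{y, x\}$. The (cartesian) product of two sets $u$ and $v$ is now defined as the set

$u \times v = \{z \in \mathcal{P}(\mathcal{P}(u \cup v)) \mid \exists_{x \in u} \exists_{y \in v}\, z = \langle x, y \rangle\}$, (D.142)

i.e., in ZF2 we substitute $x \rightsquigarrow \mathcal{P}(\mathcal{P}(u \cup v))$ as well as

$\varphi(z) \rightsquigarrow \exists_{x \in u} \exists_{y \in v}\, z = \langle x, y \rangle$, (D.143)

and denote the (unique) set $y$ thus defined by $u \times v$. Informally, one often writes

$u \times v = \{\langle x, y \rangle \mid x \in u, y \in v\}$. (D.144)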

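We are now in a position to define functions in ZF set theory:

Definition D.15. A function $f : u \to v$ is a subset $G_f \subset u \times v$ for which

$\forall_{x \in u} \exists!_{y \in v}\, \langle x, y \rangle \in G_f$. (D.145)

Here $\exists!_{y \in v}\, \psi(y)$ abbreviates

$\exists_y\, ((y \in v) \wedge (\forall_z\, (\psi(z) \leftrightarrow z = y)))$, (D.146)

(cf. (D.127)), which yields (D.145) upon the substitution $\psi(y) \rightsquigarrow \langle x, y \rangle \in G_f$.
More generally, one has

$\exists!_y\, \psi(y) \equiv \exists_y \forall_z\, (\psi(z) \leftrightarrow z = y)$. (D.147)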
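The chain from (D.141) to Definition D.15 can be replayed on small finite sets. The sketch below again uses `frozenset` as a stand-in for sets, with hypothetical function names; it builds Kuratowski pairs, the cartesian product, and a test of the condition (D.145) that a subset of $u \times v$ is the graph of a function.

```python
def kpair(x, y):
    """(D.141): the ordered pair <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian(u, v):
    """(D.144): the set of all pairs <x, y> with x in u and y in v."""
    return frozenset(kpair(x, y) for x in u for y in v)

def is_function(graph, u, v):
    """Definition D.15: graph is a subset of u x v, and every x in u has exactly one y."""
    if not graph <= cartesian(u, v):
        return False
    return all(sum(1 for y in v if kpair(x, y) in graph) == 1 for x in u)

u = frozenset({1, 2})
v = frozenset({"a", "b"})

constant_a = frozenset({kpair(1, "a"), kpair(2, "a")})                  # a function
two_valued = frozenset({kpair(1, "a"), kpair(1, "b"), kpair(2, "a")})   # 1 has two values
partial = frozenset({kpair(1, "a")})                                    # 2 has no value
```

Unlike the unordered pair, `kpair(1, 2)` and `kpair(2, 1)` are different sets, which is what makes the encoding an ordered pair. Of the three candidate graphs, only `constant_a` passes `is_function(..., u, v)`.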
Hence in ZF set theory a function $f$ is defined by (or even identified with) its
graph $G_f$, which closes the historical circle: Newton clearly looked at what we now
call functions through their graphs, upon which Euler began to assign some value
$f(x) \in v$ to $x \in u$ (though always through some concrete prescription). The 19th
century brought the abstract idea of a function as a map between sets, which, as we
just saw, ZF set theory replaced by the view that a function is defined by its graph.

Compared to the standard interpretation of PA in the natural numbers, which was
a special case of the general notion of a model described in §D.4, the standard model
of ZF is unusual, in that its carrier is not a set (but a so-called class), called the
set-theoretic universe (or cumulative hierarchy) $\mathbf{V}$, whose construction was first given
by none other than von Neumann, whose name already pervaded this book. We
will not go into the details of this construction except by noting that, much as the
natural numbers may be built from zero by repeated use of the successor function
$S$, the universe $\mathbf{V}$ is constructed from the empty set $\emptyset$ by “repeated” use of:

• The successor operation $V \mapsto V^+ = V \cup \{V\}$, cf. ZF7.

• The union operation $V \mapsto \bigcup V$, cf. ZF5.

• The power set operation $V \mapsto \mathcal{P}(V)$, cf. ZF6.

However, what is really meant here by “repeated” defies imagination (and may drive
one crazy); fortunately, most of mathematics only uses the lower echelons of $\mathbf{V}$.

Furthermore, interpreting the constant $\emptyset$ by the usual empty set (with the same
name), the interpretation $\varepsilon$ of $\in$ in $\mathbf{V}$ needs to be defined. This is done as follows:

1. There exists no set $Z$ such that $Z \varepsilon \emptyset$.

2. One has $Z \varepsilon V^+$ iff $Z \varepsilon V$ or $Z = V$.

3. One has $Z \varepsilon \bigcup V$ iff there exists $W \varepsilon V$ with $Z \varepsilon W$.

4. One has $Z \varepsilon \mathcal{P}(V)$ iff $Z \subset V$, where we say $Z \subset V$ iff for all $Y \varepsilon Z$ one has $Y \varepsilon V$.

Here $V$, $Y$, and $Z$ are sets in $\mathbf{V}$. Applying these rules “iteratively” (see, however, the
above comment on “repeated”), for all sets $X$ and $Y$ in $\mathbf{V}$, it can in principle be
established whether or not $X \varepsilon Y$, so that the symbol $\varepsilon$ is defined within $\mathbf{V}$. Having access
to the universe $\mathbf{V}$, $\varepsilon$, and the empty set $\emptyset$, one may then define the interpretation
$[\![\varphi]\!]_{\mathbf{V}}$ of some formula $\varphi$ of ZF in $\mathbf{V}$ by the following rules (cf. PA and $\mathbb{N}$):

1. The range of all variables is $\mathbf{V}$, i.e., $\forall_x\, \varphi(x)$ means that $\varphi(Y)$ holds for all $Y \in \mathbf{V}$.

2. The constant $\emptyset$ is interpreted as the empty set.

3. The predicate symbol $\in$ is interpreted as the membership relation $\varepsilon$.

A sentence $\varphi$ in ZF is then true, denoted by $\mathbf{V} \models \varphi$, if $[\![\varphi]\!]_{\mathbf{V}}$ is true. For example, all
axioms of ZF are true in this interpretation (which is by no means trivial!).

In particular, in this model we interpret $\underline{n}$ (see the explication of ZF7 above) as
the $n$-fold iteration of the successor operation applied to $\emptyset$, i.e., $\underline{n} = \emptyset^{+ \cdots +}$ (with $n$ pluses),
seen as an element of $\mathbf{V}$, and recover the standard model of the natural numbers
(and hence the carrier of the standard interpretation of PA) as $\mathbb{N} = \bigcup_n \underline{n}$, which is the
intersection of all sets in $\mathbf{V}$ that contain all sets $\underline{n}$ (for any finite $n$).
Notes

The “modernist” transformation of mathematics led by Hilbert, including its
complete prehistory and aftermath, is delightfully described in Gray (2008). The
revolutionary nature of Hilbert’s views, which started with his influential book
Grundlagen der Geometrie from 1899, is nowhere clearer than from his correspondence
with Frege (cf. Gabriel et al., 1980), who, though one of the fathers of the
formalisation of mathematics (specifically through first-order logic), infuriated Hilbert by
stating that the latter did not bother to define the notions of “point” or “line” because
Hilbert assumed these to be familiar to his readers. But no, quite to the contrary:

‘Hier liegt wohl der Cardinalpunkt des Misverständnisses (...) Ich will nichts als bekannt
voraussetzen (...) Wenn ich unter meinen Punkten irgendwelche Systeme von Dingen, z.B.
das System: Liebe, Gesetz, Schornsteinfeger ..., denke und dann meine sämtlichen Axiome
als Beziehungen zwischen diesen Dingen annehme, so gelten meine Sätze, z.B. der
Pythagoras, auch von diesen Dingen.’ (Hilbert to Frege, 29-12-1899).*

This may be an exaggeration, however. Einstein probably came closer to the truth:

‘An dieser Stelle nun taucht ein Rätsel auf, das Forscher aller Zeiten so viel beunruhigt hat.
Wie ist es möglich, daß die Mathematik, die doch ein von aller Erfahrung unabhängiges
Produkt des menschlichen Denkens ist, auf die Gegenstände der Wirklichkeit so vortrefflich
paßt? Kann denn die menschliche Vernunft ohne Erfahrung durch bloßes Denken
Eigenschaften der wirklichen Dinge ergründen?

Hierauf ist nach meiner Ansicht kurz zu antworten: Insofern sich die Sätze der Mathematik
auf die Wirklichkeit beziehen, sind sie nicht sicher, und insofern sie sicher sind, beziehen
sie sich nicht auf die Wirklichkeit.’ (Einstein, 1921).^

The great irony is that Hilbert’s call for abstraction, which at first sight decoupled 
mathematics from its origins in physics and other applications, in fact very rapidly 
led to the deepest applications of mathematics to physics so far, such as the use of 
(pseudo) Riemannian geometry in general relativity, and the use of Hilbert (!) spaces 
and operator algebras in quantum mechanics. In the present book, a high point of this 
paradox is the use of Grothendieck toposes (cf. Appendix E) in quantum mechanics 
(see Chapter 12), especially because Grothendieck himself almost made a sport of 
extreme abstraction, partly motivated by internal mathematical needs in algebraic 
geometry, but undoubtedly also by his indignation about the use of (mathematical) 
physics for military purposes (which put him diametrically against von Neumann). 

* This is surely the central point of the misunderstanding (...) I do not want to assume anything 
as known (...) If I interpret my notions by arbitrary things, for example, by the system: love, law, 
chimney sweeper, and subsequently interpret my axioms as relations between these things, then 
my theorems, like the one of Pythagoras, hold about these things. (Translation by the author) 

^ At this point an enigma presents itself, which in all ages has agitated inquiring minds. How 
can it be that mathematics, being after all a product of human thought which is independent of 
experience, is so admirably appropriate to the objects of reality? Is human reason, then, without 
experience, merely by taking thought, able to fathom the properties of real things? 

In my opinion the answer to this question is, briefly, this: as far as the propositions of mathe¬ 
matics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality. 
(Translation: Sonja Bargmann) 
§D.1. Order theory and lattices

For lattice theory in general and Stone’s Theorem see Givant & Halmos (2009), 
Davey & Priestley (2002), and Johnstone (1982). For (D.36) - (D.37) see Theorem 
33 in Chapter 35 of Givant & Halmos (2009). 

§D.2. Propositional logic 

Halmos & Givant (1998) is an elementary exposition of the connection between
Boolean lattices and logic. Other useful (propositional as well as first-order) logic
texts include Bell & Machover (1977), Johnstone (1987), Kaye (2007), and
Mendelson (2010).

§D.3. Intuitionistic propositional logic 

Key writings on intuitionism (at least from the Dutch school) include Brouwer
(1907, 1918, 1975), Heyting (1956) and Troelstra & van Dalen (1988). See also
Dummett (2000) for a view from abroad. Our treatment of Kripke models for
intuitionistic propositional logic is taken from Goldblatt (1984) and Palmgren (2009).

§D.4. First-order (predicate) logic 

For the history of first-order logic see Grattan-Guinness (2000) and Mancosu,
Zach, & Badesa (2004), plus innumerable books about Frege, Russell, Hilbert, etc.
It is regrettable that the close companionship of mathematics and philosophy at the
time, whose cross-fertilization has given us both the modern foundations of
mathematics on the one hand and analytic philosophy on the other, has not lasted.

§D.5. Arithmetic and set theory

For PA see e.g. Kaye (1991), which focuses on non-standard models. The bible of
ZF set theory is Jech (2006).






Appendix E 

Category theory and topos theory 


This appendix gives a brief introduction to category theory, moving towards the
particular categories that are of interest to quantum theory (viz. categories of presheaves
and sheaves) as quickly as possible (but not more quickly). However, even the basic
setup of category theory is already relevant for e.g. the conceptually most
satisfactory formulation of Gelfand duality, as described below Theorem C.23 (see also
Theorem C.45), and likewise of Stone duality, see Theorem D.5. Otherwise, this
material will only be used in Chapter 12 on quantum logic. We omit most proofs.

Categories were originally introduced by Eilenberg & Mac Lane (1945) in order
to define natural transformations, through which they formalized (and explained)
the intuition that certain isomorphisms in mathematics are “natural” or “canonical”
(like the one between the second dual $V^{**}$ of a finite-dimensional vector space $V$ and
$V$ itself, as opposed to the isomorphism between $V^*$ and $V$). Natural transformations
are predicated on categories and functors, i.e. maps between categories, which are
analogous to continuous functions between topological spaces, and in turn give rise
to new categories, similarly to functions giving rise to function spaces in functional
analysis. Initially meant to organize certain fields of mathematics in a systematic
way (such as algebraic topology and homological algebra), categories soon became
objects of study in their own right. As such, the basic vocabulary of category theory
is completed by defining adjoint functors (invented by Kan in 1958) and (co)limits.

Toposes are categories with enough structure to support the interpretation of
first-order (and even higher-order) intuitionistic logic, similar to set theory providing
semantics for classical predicate logic, which in turn generalizes the relationship
between propositional logic and Boolean algebra, cf. §D. In this respect, the
presence of a truth object (i.e. subobject classifier) partly explains their potential
relevance to quantum mechanics. However, toposes were introduced in the 1960s by
Grothendieck from a completely different motivation, namely algebraic geometry,
and were originally seen by him as generalizations of topological spaces. This
aspect plays an equally important role for quantum mechanics, and hence we quote:

‘A startling aspect of topos theory is that it unifies two seemingly wholly distinct
mathematical subjects: on the one hand, topology and algebraic geometry, and on the other hand,
logic and set theory.’ (Mac Lane & Moerdijk, 1992, p. 1).

© The Author(s) 2017 805 

K. Landsman, Foundations of Quantum Theory, 

Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3 
E.1 Basic definitions

The definition of a category emphasizes the idea that one is at least as interested
in the maps between objects as in the objects themselves. The only complication
(which we ignore) is the use of classes; categories are often too big to be sets, and
hence they require an axiomatization of mathematics different from standard ZF set
theory (such as von Neumann-Bernays-Gödel set theory or algebraic set theory).

Definition E.1. A category $C = (C_1, C_0, i, s, t, m)$ consists of:

• A class $C_0$ of objects.

• A class $C_1$ of arrows (also called morphisms).

• Maps $s : C_1 \to C_0$ (the source map), $t : C_1 \to C_0$ (the target map), $i : C_0 \to C_1$
(the identities map), and $m : C_2 = C_1 \times_{C_0} C_1 \to C_1$ (multiplication), where

$C_1 \times_{C_0} C_1 = \{(f, g) \in C_1 \times C_1 \mid s(f) = t(g)\}$, (E.1)

such that, writing $fg = m(f, g)$ and $\mathrm{id}_x = i(x)$,

$s(fg) = s(g)$; (E.2)

$t(fg) = t(f)$; (E.3)

$(fg)h = f(gh)$; (E.4)

$s(\mathrm{id}_x) = t(\mathrm{id}_x) = x$; (E.5)

$f\, \mathrm{id}_{s(f)} = \mathrm{id}_{t(f)}\, f = f$. (E.6)

Note that (E.4) is well defined by virtue of (E.2) - (E.3). We often write $x \xrightarrow{f} y$ or
$f : x \to y$ or, even better in principle but cumbersome in practice (see below), $y \xleftarrow{f} x$,
when $f \in C_1$ satisfies $s(f) = x$ and $t(f) = y$, and interpret $f$ as an arrow from $x$ to
$y$, so that $\mathrm{id}_x$ is an arrow from $x$ to $x$. Composition $f \circ g = fg$ of arrows is defined
whenever $s(f) = t(g)$ (so that on paper the preferred direction of an arrow is from
right to left!). Arrow composition is associative whenever defined, and each $i(x)$ acts
as an identity under this composition operation. The class of all arrows from $x$ to $y$
in a category C is sometimes written as $\mathrm{Hom}_C(x, y)$, or simply as $\mathrm{Hom}(x, y)$, when C
is unambiguous. A category is called small if both $C_0$ and $C_1$ are sets (otherwise, a
category is called large), and locally small if for each $x, y \in C_0$ the class $\mathrm{Hom}_C(x, y)$
is a set (although $C_1$ itself may be a proper class). All categories used in this book
are locally small (though not necessarily small). Here are some examples.

• Sets has sets as objects and functions as arrows. Sets is a large category, but it
follows from the ZF axioms for set theory that it is locally small (as promised).

• Any set $X$ with a preorder $\leq$ (and hence any poset) defines a category X where
$X_0 = X$ and $\mathrm{Hom}(x, y)$ contains a unique arrow iff $x \leq y$, being empty otherwise.

• A small category in which each arrow is invertible is called a groupoid; see
§C.16. In particular, any group may be seen as a category with just a single object.
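The axioms (E.2) - (E.6) can be verified mechanically for a small category. The sketch below takes the poset example from the list above, with the divisors of 6 ordered by divisibility as an illustrative choice; an arrow from $x$ to $y$ is encoded as the pair `(x, y)`, and the names `s`, `t`, `i`, `m` follow Definition E.1, while everything else is hypothetical.

```python
from itertools import product

OBJECTS = [1, 2, 3, 6]
# Exactly one arrow x -> y whenever x divides y, encoded as the pair (x, y).
ARROWS = [(x, y) for x, y in product(OBJECTS, repeat=2) if y % x == 0]

def s(f):
    """Source map."""
    return f[0]

def t(f):
    """Target map."""
    return f[1]

def i(x):
    """Identities map."""
    return (x, x)

def composable(f, g):
    """(E.1): the pair (f, g) lies in C_2 iff s(f) = t(g)."""
    return s(f) == t(g)

def m(f, g):
    """Multiplication fg ("f after g"), defined only on composable pairs."""
    if not composable(f, g):
        raise ValueError("arrows are not composable")
    return (s(g), t(f))

def is_category():
    for f, g in product(ARROWS, repeat=2):
        if composable(f, g):
            fg = m(f, g)
            if fg not in ARROWS:
                return False
            if s(fg) != s(g) or t(fg) != t(f):            # (E.2), (E.3)
                return False
    for f, g, h in product(ARROWS, repeat=3):
        if composable(f, g) and composable(g, h):
            if m(m(f, g), h) != m(f, m(g, h)):            # (E.4)
                return False
    for x in OBJECTS:
        if i(x) not in ARROWS or s(i(x)) != x or t(i(x)) != x:   # (E.5)
            return False
    for f in ARROWS:
        if m(f, i(s(f))) != f or m(i(t(f)), f) != f:      # (E.6)
            return False
    return True
```

The category has nine arrows; for instance there is no arrow from 2 to 3, since 2 does not divide 3, and the composite of `(2, 6)` after `(1, 2)` is `(1, 6)`.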
Categories come with an intrinsic notion of isomorphism: one calls two objects
$x, y \in C_0$ isomorphic, written $x \cong y$, when there exist arrows $f : x \to y$ and $g : y \to x$
such that $fg = \mathrm{id}_y$ and $gf = \mathrm{id}_x$. For example, two sets are in bijective
correspondence iff they are isomorphic objects in Sets, two topological spaces are
homeomorphic iff they are isomorphic in the category of topological spaces and continuous
maps, and two C*-algebras are isomorphic in the sense of Definition C.2 iff they are
isomorphic in CA, where we define the following useful categories of C*-algebras:

• CA, which has C*-algebras as objects and homomorphisms as arrows.

• CAm, again having C*-algebras as objects, but now with nondegenerate
homomorphisms into the multiplier algebra as arrows (cf. Theorem C.76 etc.).

• CAn, with C*-algebras and nondegenerate homomorphisms (cf. Definition C.42).

• CA1, with unital C*-algebras as objects and unital homomorphisms as arrows.

• CCA, CCAm, CCAn, and CCA1, i.e. the full subcategories of CA, CAm, CAn,
and CA1, respectively, in which the objects are commutative C*-algebras.

Here the notion of a subcategory $C \subset D$ is the obvious one, i.e. $C_0 \subset D_0$, $C_1 \subset D_1$,
and C is a category by itself (in particular, C is closed under the maps $s, t, i, m$). We
say that C is a full subcategory of D if $\mathrm{Hom}_C(x, y) = \mathrm{Hom}_D(x, y)$ for all $x, y \in C_0$.

We now define the “canonical” maps between categories (which, in the spirit of
the subject, are often more important than the underlying categories themselves!).

Definition E.2. Let C and D be categories. A covariant functor or simply functor
$F : C \to D$ consists of a pair of maps $F_i : C_i \to D_i$, $i = 0, 1$, such that:

$i_D \circ F_0 = F_1 \circ i_C$; (E.7)

$s_D \circ F_1 = F_0 \circ s_C$; (E.8)

$t_D \circ F_1 = F_0 \circ t_C$; (E.9)

$F_1(fg) = F_1(f) F_1(g)$ $((f, g) \in C_2)$, (E.10)

where $i_D : D_0 \to D_1$ is the inclusion map in D, etc.

A contravariant functor $F : C \to D$ is a pair $F_i : C_i \to D_i$, $i = 0, 1$, such that:

$i_D \circ F_0 = F_1 \circ i_C$; (E.11)

$s_D \circ F_1 = F_0 \circ t_C$; (E.12)

$t_D \circ F_1 = F_0 \circ s_C$; (E.13)

$F_1(fg) = F_1(g) F_1(f)$ $((f, g) \in C_2)$. (E.14)

It follows that $F_0$ is determined by $F_1$, since $i$ is injective, but nonetheless it is useful
to keep them apart. The use of contravariant functors may be avoided by introducing
the opposite category $C^{op}$ of C, which has the same objects and arrows as C, but the
latter going in the opposite direction (i.e. $s_{C^{op}} = t_C$, etc.). For example, if C = X is
a preorder, in the category $X^{op}$ the partial order is reversed. A contravariant functor
$F : C \to D$ is then obviously the same thing as a covariant functor $F : C \to D^{op}$, or,
equivalently, $F : C^{op} \to D$. This is very important for us, because Gelfand duality is
based on contravariant functors and hence on opposite categories; see below.
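Between poset categories, a covariant functor is just a monotone map and a contravariant functor an order-reversing one, so the conditions (E.7) - (E.14) can be checked exhaustively. The sketch below is an illustration with hypothetical choices throughout: C is the divisors of 6 under divisibility, D is $\{0, 1, 2\}$ under $\leq$, `F0` counts prime factors, and `G0` sends $x$ to $6/x$; arrows are encoded as pairs `(source, target)`.

```python
from itertools import product

def s(f):
    return f[0]

def t(f):
    return f[1]

def i(x):
    return (x, x)

def m(f, g):
    """Composite fg, defined when s(f) = t(g)."""
    assert s(f) == t(g)
    return (s(g), t(f))

C0 = [1, 2, 3, 6]
C1 = [(x, y) for x, y in product(C0, repeat=2) if y % x == 0]   # x divides y
D0 = [0, 1, 2]
D1 = [(a, b) for a, b in product(D0, repeat=2) if a <= b]       # a <= b

# Covariant candidate F : C -> D (number of prime factors).
F0 = {1: 0, 2: 1, 3: 1, 6: 2}
def F1(f):
    return (F0[s(f)], F0[t(f)])

# Contravariant candidate G : C -> C (x goes to 6 / x, reversing divisibility).
def G0(x):
    return 6 // x
def G1(f):
    return (G0(t(f)), G0(s(f)))

def composable_pairs(arrows):
    return [(f, g) for f, g in product(arrows, repeat=2) if s(f) == t(g)]

def is_covariant():
    return (all(F1(f) in D1 for f in C1)
            and all(i(F0[x]) == F1(i(x)) for x in C0)                        # (E.7)
            and all(s(F1(f)) == F0[s(f)] for f in C1)                        # (E.8)
            and all(t(F1(f)) == F0[t(f)] for f in C1)                        # (E.9)
            and all(F1(m(f, g)) == m(F1(f), F1(g))
                    for f, g in composable_pairs(C1)))                       # (E.10)

def is_contravariant():
    return (all(G1(f) in C1 for f in C1)
            and all(i(G0(x)) == G1(i(x)) for x in C0)                        # (E.11)
            and all(s(G1(f)) == G0(t(f)) for f in C1)                        # (E.12)
            and all(t(G1(f)) == G0(s(f)) for f in C1)                        # (E.13)
            and all(G1(m(f, g)) == m(G1(g), G1(f))
                    for f, g in composable_pairs(C1)))                       # (E.14)
```

Reading the pairs produced by `G1` with source and target swapped turns `G` into a covariant functor into the opposite category, which is the reformulation described in the paragraph above.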
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Definition E.3. A natural transformation between two functors F -.C^D and G : 
C —D f that are either both covariant or both contravariant) is a map T ; Cq —Di, 
written x i—>■ lx, such that sd('I») = -fbW and toftx) = Go{x )— in other words, T is 
a collection of maps Xx : Fq(x) —>■ Gq{x) indexed by x G Cq— such that the following 
diagram commutes for all arrows f : x ^y: 


Fo{x) 

Fo{y) 


^ Goix) 

^ GoW 


(E.15) 


Two functors F and G as above are called naturally isomorphic, written $F \cong G$,
when there exists a natural transformation $\tau$ between them for which all arrows $\tau_x$
are invertible (i.e., are isomorphisms).

It follows that if F and G are naturally isomorphic, then $F_0(x) \cong G_0(x)$ for all $x \in C_0$,
but this condition is not sufficient by itself to render F and G naturally isomorphic,
for the isomorphisms $\tau_x$ between $F_0(x)$ and $G_0(x)$ must be compatible with the arrows,
as expressed by the diagram in the above definition; this is even the whole point!
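
The compatibility condition in (E.15) can be checked by hand in a toy setting. The following is a minimal sketch in Python, not taken from the text: the "list" and "optional value" functors, and the names `list_map`, `option_map`, `first`, `biggest` and `naturality_holds`, are illustrative assumptions. It compares the two ways around the square on a few sample inputs, once for a family of maps that is natural and once for a family that is not.

```python
def list_map(f):
    """Arrow part F_1 of the list functor: lift f : A -> B to lists."""
    return lambda xs: [f(x) for x in xs]


def option_map(f):
    """Arrow part G_1 of the 'optional value' functor (None means 'no value')."""
    return lambda x: None if x is None else f(x)


def first(xs):
    """Candidate component tau_A : List(A) -> Option(A): take the first entry."""
    return xs[0] if xs else None


def biggest(xs):
    """Another family of maps List(A) -> Option(A): take the largest entry."""
    return max(xs) if xs else None


def naturality_holds(tau, f, samples):
    """Compare tau . F_1(f) with G_1(f) . tau on the given sample lists."""
    return all(tau(list_map(f)(xs)) == option_map(f)(tau(xs)) for xs in samples)


samples = [[], [1], [2, 3, 5]]
first_is_natural = naturality_holds(first, lambda n: n * n, samples)
biggest_is_natural = naturality_holds(biggest, lambda n: -n, samples)
```

Here `first` passes the test, whereas `biggest` fails it for the order-reversing map n ↦ −n, although both are perfectly good families of maps indexed by objects. A finite check of this kind only samples the square, of course; it illustrates the condition rather than proving naturality.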

Definition E.3 clarifies the idea that the double dual $V^{**}$ of any finite-dimensional
vector space V is isomorphic to V in a “natural” way: namely, the functor $(-)^{**}$ from
the category of finite-dimensional vector spaces (over $\mathbb{C}$) to itself (with linear maps
as arrows) is naturally isomorphic to the identity functor through the natural
transformation whose components $\tau_V : V \to V^{**}$ are given by the “Gelfand transform”
$v \mapsto \hat{v}$, where $\hat{v}(\theta) = \theta(v)$ for $\theta \in V^*$. In contrast, the dual $V^*$ is isomorphic to V in
an “unnatural” way, in that any isomorphism depends on the choice of a basis.

Definition E.4. Two categories C, D are called equivalent, written $C \simeq D$, when
there exist (covariant) functors $F : C \to D$ and $G : D \to C$ such that $F \circ G \cong \mathrm{id}_D$
and $G \circ F \cong \mathrm{id}_C$. Similarly, C and D are called dual when there exist contravariant
functors with the same properties, i.e., if C and $D^{op}$ are equivalent.

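Here $\mathrm{id}_C$ is the identity functor on C, etc. Spelling out what this means, using
Definition E.3, yields the commutative diagrams

$$\begin{array}{ccc}
G_0 \circ F_0(x) & \xrightarrow{\ \tau_x\ } & x \\
\downarrow{\scriptstyle G_1 \circ F_1(f)} & & \downarrow{\scriptstyle f} \\
G_0 \circ F_0(y) & \xrightarrow{\ \tau_y\ } & y
\end{array}$$    (E.16)

for all $f : x \to y$ in $C_1$, where each $\tau_x$ is invertible, and also for all $f' : x' \to y'$ in $D_1$,

$$\begin{array}{ccc}
F_0 \circ G_0(x') & \xrightarrow{\ \cong\ } & x' \\
\downarrow{\scriptstyle F_1 \circ G_1(f')} & & \downarrow{\scriptstyle f'} \\
F_0 \circ G_0(y') & \xrightarrow{\ \cong\ } & y'
\end{array}$$    (E.17)

We are now in a position to give a categorical (re)formulation of Gelfand duality.
Further to the categories of commutative C*-algebras $CCA_1$, $CCA_n$, and $CCA_m$
defined earlier in this section, this involves the following categories of spaces: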

• CH, i.e. the category of compact Hausdorff spaces and continuous maps. 

• LCHp, with locally compact Hausdorff spaces and proper continuous maps. 

• LCH, the category of locally compact Hausdorff spaces and continuous maps. 

Theorem E.5. There are categorical equivalences (i.e., dualities if ‘op’ is omitted):

$CCA_1 \simeq CH^{op}$;    (E.18)

$CCA_n \simeq LCHp^{op}$;    (E.19)

$CCA_m \simeq LCH^{op}$.    (E.20)

Proof. In the proof of Theorem C.23, the maps $\mathrm{ev}_X$ provide a natural isomorphism
between the functors $\mathrm{id}_{CH}$ and $\Sigma \circ C$ from CH to itself, whilst the maps $G_A$ perform
the same job for the functors $\mathrm{id}_{CCA_1}$ and $C \circ \Sigma$ from $CCA_1$ to itself; the naturality
properties (C.40) and (C.41) precisely express commutativity of the above diagrams.
Likewise for the other two cases, which restate Theorems C.45 and C.76. □

Similarly, Stone’s Theorem D.5 is best seen categorically, stating that the category
of Boolean lattices (with homomorphisms preserving $\vee$, $\wedge$, and $\perp$ as arrows) is
dual to the category of Stone spaces (as a full subcategory of CH). With hindsight,
Stone’s Theorem (which predated category theory) was the first such duality result.
Definition E.4 may be strengthened by replacing the isomorphisms

$F \circ G \cong \mathrm{id}_D$;    (E.21)

$G \circ F \cong \mathrm{id}_C$,    (E.22)

by equalities, i.e.,

$F \circ G = \mathrm{id}_D$;    (E.23)

$G \circ F = \mathrm{id}_C$.    (E.24)

In that case, the categories C and D are called isomorphic. However, this is less
relevant than the following weakening of the first two conditions, called an adjunction:

Definition E.6. Two functors $F : C \to D$ and $G : D \to C$ form an adjoint pair if there
exist natural transformations $\eta$ from $\mathrm{id}_C$ to $G \circ F$ (called the unit of the adjunction),
and $\varepsilon$ from $F \circ G$ to $\mathrm{id}_D$ (called the counit of the adjunction), such that the following
diagrams of natural transformations (i.e. the triangle identities) commute:

$$F \xrightarrow{\ F\eta\ } FGF \xrightarrow{\ \varepsilon F\ } F \ \text{ equals } \ \mathrm{id}_F; \qquad
G \xrightarrow{\ \eta G\ } GFG \xrightarrow{\ G\varepsilon\ } G \ \text{ equals } \ \mathrm{id}_G.$$    (E.25)

We write $F \dashv G$, and say that F is left-adjoint to G, or that G is right-adjoint to F.

It is easy to see that if they exist, left or right adjoints are unique up to isomorphism.
If we assume that C and D are locally small (in that all classes $\mathrm{Hom}_C(x,y)$ and $\mathrm{Hom}_D(x',y')$
are sets), then the above definition states that the functors $\mathrm{Hom}_D(F(-),-)$ and
$\mathrm{Hom}_C(-,G(-))$, both defined from $C^{op} \times D$ to Sets, are naturally isomorphic. In
other words, for each $x \in C_0$ and $y' \in D_0$, we have a bijection

$\mathrm{Hom}_D(F(x),y') \cong \mathrm{Hom}_C(x,G(y'))$    (E.26)

that is natural in both variables x and y' (i.e., for each $y' \in D_0$, the functors
$\mathrm{Hom}_D(F(-),y')$ and $\mathrm{Hom}_C(-,G(y'))$ from $C^{op}$ to Sets are naturally isomorphic,
and for each $x \in C_0$, the functors $\mathrm{Hom}_D(F(x),-)$ and $\mathrm{Hom}_C(x,G(-))$ from D to
Sets are naturally isomorphic). Indeed, the natural bijection (E.26) is given by

$(F(x) \xrightarrow{f} y') \mapsto (x \xrightarrow{\eta_x} GF(x) \xrightarrow{G(f)} G(y'))$;    (E.27)

$(x \xrightarrow{g} G(y')) \mapsto (F(x) \xrightarrow{F(g)} FG(y') \xrightarrow{\varepsilon_{y'}} y')$.    (E.28)

This may even be interesting if C = D, and hence $F : C \to C$ and $G : C \to C$. For
example, a Heyting algebra H (seen as a posetal category) is home to an adjunction

$(-) \wedge y \dashv y \to (-)$,    (E.29)

for any fixed $y \in H$, where, writing (E.29) as $F \dashv G$ as usual, we put

$F_0(x) = x \wedge y$;    (E.30)

$G_0(x) = (y \to x)$.    (E.31)
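
In a posetal category the bijection (E.26) for the adjunction (E.29) reduces to the statement that x ∧ y ≤ z holds iff x ≤ (y → z). The sketch below checks this exhaustively in one small example. The choice of Heyting algebra (the upper sets of a three-point poset, ordered by inclusion) and all names in the code are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical three-point poset: 0 lies below 1 and below 2.
POINTS = (0, 1, 2)
LEQ = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2)}


def is_upper(s):
    """A subset is an upper set if it is closed under going up in the order."""
    return all(y in s for x in s for y in POINTS if (x, y) in LEQ)


# The Heyting algebra H: all upper sets, ordered by inclusion.
UPPER = [frozenset(c)
         for r in range(len(POINTS) + 1)
         for c in combinations(POINTS, r)
         if is_upper(frozenset(c))]


def meet(u, v):
    """The meet in H is intersection."""
    return u & v


def implies(u, v):
    """u -> v: the largest w in H with (w meet u) <= v, built as a union."""
    return frozenset().union(*[w for w in UPPER if meet(w, u) <= v])


# (E.29) in a poset: (x meet y) <= z  iff  x <= (y -> z), for all x, y, z in H.
adjunction_holds = all(
    (meet(x, y) <= z) == (x <= implies(y, z))
    for x in UPPER for y in UPPER for z in UPPER
)
```

The implication is computed as the union of all candidates w, which is again an upper set and still satisfies w ∧ u ≤ v because intersection distributes over union; this is what makes it the largest such element.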

Definition E.4 of an equivalence of categories involves an adjunction $F \dashv G$
whose unit and counit are both natural isomorphisms, as opposed to mere natural
transformations, as in Definition E.6 of an adjunction. In that case, G is an inverse
to F up to isomorphism of objects (which still falls short of an exact inverse, which
as mentioned would lead to the less important notion of isomorphism of categories).
But even for an adjunction, one may regard G as a weak kind of inverse to F, which
allows one to move between categories in the direction opposite to F.

Other than equivalences of categories, the traditional examples of adjunctions
yield left adjoints to so-called forgetful functors, which strip some class of
mathematical objects of (some of) its structure. For example, if Grp is the category of
groups and homomorphisms, the forgetful functor $G : Grp \to Sets$ sends a given
group to its underlying set; this functor has a left adjoint $F : Sets \to Grp$ that assigns
the free group on a set X to X. Similarly for vector spaces, Boolean algebras, etc.
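
The bijection (E.26) for a free/forgetful adjunction can be made concrete in a simpler case than groups. The sketch below swaps in free monoids for the free groups of the text (an assumption made because words over X need no inverses): a function from X into a monoid M extends uniquely to a homomorphism on words, and restriction to one-letter words goes back. The names `free_extension` and `restrict`, and the sample monoid of integers under addition, are hypothetical.

```python
def free_extension(h, unit, op):
    """Extend h : X -> M to a monoid homomorphism on words (lists) over X."""
    def h_bar(word):
        result = unit
        for letter in word:
            result = op(result, h(letter))
        return result
    return h_bar


def restrict(k):
    """Inverse direction: recover a function on X from a map on words."""
    return lambda x: k([x])


# Sample data: X = {"a", "b"}, M = integers under addition with unit 0.
weights = {"a": 1, "b": 10}
h = weights.__getitem__
h_bar = free_extension(h, 0, lambda m, n: m + n)

u, v = ["a", "b"], ["b", "b", "a"]
# Concatenation of words is sent to the sum, and the empty word to the unit.
is_homomorphism = h_bar(u + v) == h_bar(u) + h_bar(v) and h_bar([]) == 0
# Restricting the extension to one-letter words gives back h.
round_trip = all(restrict(h_bar)(x) == h(x) for x in weights)
```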

We now move on to limits and colimits, whose general definition we precede by
a few special cases. These abstract the corresponding constructions from Sets (and
hence pave the way for topos theory, which resembles set theory in various ways),
so that for the right “feeling” we switch to labeling objects in a category by capitals.

Definition E.7. Let C be a category (for simplicity assumed to be locally small).

• A product of a pair $X, Y \in C_0$ is an object $X \times Y \in C_0$, with arrows $p_1 : X \times Y \to X$
and $p_2 : X \times Y \to Y$, such that for all arrows $q_1 : Z \to X$ and $q_2 : Z \to Y$, there
is a unique arrow $Z \to X \times Y$ making the following diagram commute:

$$\begin{array}{ccccc}
 & & Z & & \\
 & {}^{q_1}\swarrow & \downarrow & \searrow{}^{q_2} & \\
X & \xleftarrow{\ p_1\ } & X \times Y & \xrightarrow{\ p_2\ } & Y
\end{array}$$    (E.32)

If each pair of objects in $C_0$ has a product, C is said to have binary products.
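The next part of the definition relies on the following fact about products, which
is easy to prove: given $f : X \to X'$ and $g : Y \to Y'$ in $C_1$, there is a unique arrow

$f \times g : X \times Y \to X' \times Y'$    (E.33)

such that the following diagram commutes:

$$\begin{array}{ccccc}
X & \xleftarrow{\ p_1\ } & X \times Y & \xrightarrow{\ p_2\ } & Y \\
\downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f \times g} & & \downarrow{\scriptstyle g} \\
X' & \xleftarrow{\ p_1'\ } & X' \times Y' & \xrightarrow{\ p_2'\ } & Y'
\end{array}$$    (E.34)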


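• A function space or exponential of a pair $Y, Z \in C_0$ in a category C with binary
products is an object $Z^Y \in C_0$ (which in Sets is the set of all functions $g : Y \to Z$)
with an evaluation map $\mathrm{ev} : Z^Y \times Y \to Z$ (which in Sets is $(g,y) \mapsto g(y) \in Z$),
such that for each $f : X \times Y \to Z$ there is a unique arrow $\hat{f} : X \to Z^Y$ (which in
Sets is $\hat{f}(x)(y) = f(x,y)$) making the following diagram commute:

$$\begin{array}{ccc}
X \times Y & \xrightarrow{\ f\ } & Z \\
\downarrow{\scriptstyle \hat{f} \times \mathrm{id}_Y} & \nearrow{\scriptstyle \mathrm{ev}} & \\
Z^Y \times Y & &
\end{array}$$    (E.35)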


• A terminal object is an object $1 \in C_0$ such that for each $X \in C_0$, there is a unique
arrow $X \to 1$ (in other words, $\mathrm{Hom}_C(X,1)$ contains precisely one element).

• A category C having a terminal object, binary products, and function spaces for
all objects, is called cartesian closed.

The relationship between products and function spaces is just the adjunction

$(-) \times Y \dashv (-)^Y$,    (E.36)

for each $Y \in C_0$, where the left-hand side denotes the following functor:

$(-) \times Y : C \to C$;    (E.37)

$X \mapsto X \times Y$;    (E.38)

$(f : X \to X') \mapsto (f \times \mathrm{id}_Y : X \times Y \to X' \times Y)$.    (E.39)

Here $f \times \mathrm{id}_Y$ is a special case of (E.33), whilst the right-hand side of (E.36) is

$(-)^Y : C \to C$;    (E.40)

$Z \mapsto Z^Y$;    (E.41)

$(g : Z \to Z') \mapsto (\widehat{g \circ \mathrm{ev}} : Z^Y \to Z'^Y)$,    (E.42)

where the arrow $\widehat{g \circ \mathrm{ev}}$ is defined as in the text above (E.35), in which we substitute
$X \rightsquigarrow Z^Y$, $Z \rightsquigarrow Z'$, and $f \rightsquigarrow g \circ \mathrm{ev}$; note that the latter is an arrow $Z^Y \times Y \to Z'$.

As in (E.26), the adjunction (E.36) gives a bijection

$\mathrm{Hom}_C(X \times Y, Z) \cong \mathrm{Hom}_C(X, Z^Y)$,    (E.43)

which of course is precisely the correspondence $f \leftrightarrow \hat{f}$; the counit of (E.36) is $\varepsilon = \mathrm{ev}$
(i.e., its component at Z is $\mathrm{ev} : Z^Y \times Y \to Z$), whereas the unit (at Z) is the map
$\hat{f} : X \to Z^Y$ corresponding to $f : X \times Y \to Z$ on the choices $X \rightsquigarrow Z$, $Z \rightsquigarrow Z \times Y$, and
$f : Z \times Y \to Z \times Y$ being the identity arrow.
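
In Sets the correspondence (E.43) is ordinary currying. The following sketch, with finite sample sets and function names (`curry`, `uncurry`, `ev`) chosen here for illustration, checks that the triangle (E.35) commutes, that currying can be undone, and that both sides of (E.43) have the same number of elements for the chosen sets.

```python
from itertools import product

# Hypothetical finite sets playing the roles of X, Y and Z.
X, Y, Z = (0, 1), ("a", "b"), (False, True)


def curry(f):
    """Send f : X x Y -> Z to f_hat : X -> Z^Y, with f_hat(x)(y) = f(x, y)."""
    return lambda x: lambda y: f(x, y)


def uncurry(g):
    """Inverse direction: send g : X -> Z^Y to a function on pairs."""
    return lambda x, y: g(x)(y)


def ev(g, y):
    """Evaluation map Z^Y x Y -> Z."""
    return g(y)


def f(x, y):
    """A sample arrow X x Y -> Z."""
    return (x == 1) and (y == "b")


f_hat = curry(f)

# Triangle (E.35): ev . (f_hat x id_Y) agrees with f on every pair.
triangle_commutes = all(ev(f_hat(x), y) == f(x, y) for x, y in product(X, Y))

# Currying followed by uncurrying returns the original function.
round_trip = all(uncurry(curry(f))(x, y) == f(x, y) for x, y in product(X, Y))

# Counting both sides of (E.43) for these finite sets.
n_functions_on_pairs = len(Z) ** (len(X) * len(Y))
n_curried_functions = (len(Z) ** len(Y)) ** len(X)
```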

The following construction, generalizing binary products, is very important. 

Definition E.8. The pullback of two arrows $f : X \to Z$ and $g : Y \to Z$ consists of two
arrows $p : P \to X$ and $q : P \to Y$ such that the following square commutes, and has
the universal property that for any arrows $p' : P' \to X$ and $q' : P' \to Y$ with $fp' = gq'$,
there is a unique arrow $h : P' \to P$ such that the entire diagram commutes
(i.e., $ph = p'$ and $qh = q'$):

$$\begin{array}{ccc}
P & \xrightarrow{\ q\ } & Y \\
\downarrow{\scriptstyle p} & & \downarrow{\scriptstyle g} \\
X & \xrightarrow{\ f\ } & Z
\end{array}$$    (E.44)

One says that q is a pullback of f over g, whilst p is a pullback of g over f.

In the category Sets, pullbacks coincide with fibered products, that is,

$P = X \times_Z Y = \{(x,y) \in X \times Y \mid f(x) = g(y)\}$,    (E.45)

where p and q are the projections on the first and the second coordinate, respectively.
In particular, taking Z to be a singleton reproduces binary products as special cases
of pullbacks. This can be done in all categories C with a terminal object.
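
Formula (E.45) translates directly into a computation on finite sets. The sketch below builds the fibered product, checks that the square (E.44) commutes, and constructs the mediating arrow h for one competing cone. The sample sets and maps (residues modulo 2) and the names `pullback` and `mediating` are assumptions made for illustration.

```python
def pullback(X, Y, f, g):
    """Fibered product (E.45) in Sets, with its two projections p and q."""
    P = [(x, y) for x in X for y in Y if f(x) == g(y)]

    def p(pair):
        return pair[0]

    def q(pair):
        return pair[1]

    return P, p, q


def mediating(p_prime, q_prime):
    """The arrow h : P' -> P determined by a cone (p', q') with f p' = g q'."""
    return lambda w: (p_prime(w), q_prime(w))


# Sample data: X = {0,...,5}, Y = {0,...,3}, Z = {0, 1}, both maps 'parity'.
X, Y = range(6), range(4)


def f(x):
    return x % 2


def g(y):
    return y % 2


P, p, q = pullback(X, Y, f, g)
square_commutes = all(f(p(z)) == g(q(z)) for z in P)

# A competing cone P' = {0, 1, 2} with p'(w) = 2w and q'(w) = 0.
P_prime = range(3)
h = mediating(lambda w: 2 * w, lambda w: 0)
h_lands_in_P = all(h(w) in P for w in P_prime)
h_factors = all(p(h(w)) == 2 * w and q(h(w)) == 0 for w in P_prime)

# If both maps are constant (Z a singleton), the pullback is the product.
product_case, _, _ = pullback(X, Y, lambda x: 0, lambda y: 0)
```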

At last, we turn to limits and colimits in a category. A (finite) diagram in a
category C is a functor $D : J \to C$, where J is some (finite) category. If J is
empty, there is a unique functor D into C (the empty diagram); even this is interesting!
The diagram just consisting of two objects $X, Y \in C_0$ corresponds to $J_0 = \{0,1\}$ with
$0 \neq 1$ and only identity arrows. The next case is an arrow $f : X \to Y$, obtained from
$J_0 = \{0,1\}$ as a poset, i.e., $0 \leq 1$. Finally, consider $J_0 = \{0,1,2\}$ with nontrivial
arrows $0 \to 1$ and $2 \to 1$; this defines a diagram

$Y \xrightarrow{\ g\ } Z \xleftarrow{\ f\ } X$.    (E.46)

For any $C \in C_0$, let $D_C : J \to C$ be the constant functor that sends all $j \in J_0$ to C, and
all arrows in J to $\mathrm{id}_C$. A cone over a diagram $D : J \to C$ is an object $C \in C_0$ (called
the vertex of the cone) with a natural transformation from $D_C$ to D, i.e., a collection
of arrows $c_j : C \to D_j \equiv D_0(j)$ indexed by $j \in J_0$, such that for each arrow $x : j \to k$
in $J_1$, with induced arrow $D_1(x) : D_j \to D_k$, the following triangle commutes:

$$\begin{array}{ccc}
C & \xrightarrow{\ c_k\ } & D_k \\
\downarrow{\scriptstyle c_j} & \nearrow{\scriptstyle D_1(x)} & \\
D_j & &
\end{array}$$    (E.47)

A cone over the empty diagram is just a loose object C. A cone over our two-object
diagram without arrows is $X \leftarrow C \to Y$, whereas a cone over (E.46) is a commuting
square as in (E.44). A limit of a diagram $D : J \to C$ is a universal cone over D, i.e.,
a cone $(C, \{c_j : C \to D_j\}_{j \in J_0})$ such that for any other cone $(C', \{c'_j : C' \to D_j\}_j)$ for
the same diagram there is a unique arrow $h : C' \to C$ such that

$c_j \circ h = c'_j$  $(j \in J_0)$.    (E.48)

A more elegant way of phrasing this is via the category Cone(D), whose objects are
cones over D, and whose arrows are arrows $h : C' \to C$ in $C_1$ satisfying (E.48). A
limit of D, then, is just a terminal object in Cone(D). Either way, it is clear from
the universal property that any two limits of a given diagram must be isomorphic.
Despite this lack of uniqueness, the typical notation for a limit of a diagram D is

$C = \varprojlim_j D_j$.    (E.49)

It should now be clear that a terminal object is a limit over the unique diagram
over the empty category, a (binary) product is a limit over a two-object diagram
obtained from $J_0 = \{0,1\}$ with only identities, and finally a pullback is a limit over
the diagram (E.46) obtained via $J_0 = \{0,1,2\}$ with nontrivial arrows $0 \to 1$ and $2 \to 1$.

Especially in connection with topos theory, the following fact is quite useful: 

Proposition E.9. A category has all finite limits (i.e. limits based on finite diagrams) 
iff it has all pullbacks and has a terminal object. 

Replacing C by its opposite category $C^{op}$, we obtain the colimit $C = \varinjlim_j D_j$ of a
diagram, which is defined as a limit of the same diagram seen in $C^{op}$, so that in
all definitions all arrows are reversed. Thus terminal objects are replaced by initial
objects, products become coproducts, and pullbacks are turned into pushouts.

E.2 Toposes and functor categories

The last ingredient we need for the definition of a topos is a categorical abstraction
of subsets $X \subseteq Y$ and their characteristic functions $1_X$, i.e. a subobject classifier.

Definition E.10. 1. An arrow $m : X \to Y$ in any category C is a monomorphism
(or briefly a mono) if for any $g, h : Z \to X$, the equality $mg = mh$ implies $g = h$.
Similarly, abstracting surjectivity rather than injectivity, $e : X \to Y$ is called an
epimorphism or an epi if for any $g, h : Y \to Z$, the equality $ge = he$ implies $g = h$.

2. Two monomorphisms $m : X \to Y$ and $m' : X' \to Y$ are equivalent if there is a
(necessarily unique) isomorphism $h : X \to X'$ such that $m = m'h$.

3. A subobject of Y is an equivalence class of monomorphisms $m : X \to Y$. The
class of all subobjects of Y (which is not necessarily a set) is called Sub(Y).

4. A subobject classifier in a category C is a mono $t : 1 \to \Omega$ such that all pullbacks
of t exist in C, and for any mono $m : X \to Y$ there is a unique arrow $\chi_m : Y \to \Omega$
(called the characteristic function or classifying map of m, or, loosely, of X)
that makes the following diagram a pullback (and hence makes it commutative):

$$\begin{array}{ccc}
X & \longrightarrow & 1 \\
\downarrow{\scriptstyle m} & & \downarrow{\scriptstyle t} \\
Y & \xrightarrow{\ \chi_m\ } & \Omega
\end{array}$$    (E.50)

It follows that the object 1 is terminal in C (which of course constrains C to have a
terminal object in the first place); $\Omega$ is often called the truth object of C.

Proposition E.11. If a locally small category C has a subobject classifier, then for
any $Y \in C_0$, the class Sub(Y) is a set, and the map $m \mapsto \chi_m$ induces a bijection

$\mathrm{Sub}(Y) \cong \mathrm{Hom}_C(Y, \Omega)$.    (E.51)

Proof. It follows from the definition of a pullback that equivalent monos $m : X \to Y$
and $m' : X' \to Y$ yield the same arrow $\chi_m$, so that the map $m \mapsto \chi_m$ from monos to
arrows passes to equivalence classes, i.e., we have a map $[m] \mapsto \chi_m$. The universal
part in the definition of a pullback (i.e., monos with the same classifying maps are
isomorphic) makes the latter map injective, whereas surjectivity follows from the
general fact that the pullback (namely m) of any arrow (namely $\chi$) over a mono
(namely t) is a mono, where we see (E.50) as a pullback for given $\chi$ and t. □

For example, in Sets a mono is an injective function (and an epi is surjective), so
that any mono into Y originates in some set that is isomorphic to some subset of Y.
Any singleton $1 = * = \{0\}$ serves as a terminal object, and Sets has a truth object

$\Omega = 2 = \{0,1\}$,    (E.52)

with subobject classifier $t(*) = 1$; if $X \subseteq Y$, and m is the inclusion map, then $\chi_m = 1_X$
is just the characteristic function of X, and $\mathrm{Sub}(X) \cong \mathcal{P}(X)$ is the power set of X.
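
The bijection (E.51) in Sets can be verified by brute force for a small set. In the sketch below (the five-element set and the names `chi` and `classified` are illustrative assumptions), a subset is sent to its characteristic function, and conversely a function into {0, 1} is sent to the preimage of 1, which is what pulling back t along that function amounts to in Sets.

```python
from itertools import product

Y = tuple(range(5))            # hypothetical ambient set
X = frozenset({1, 3})          # a sample subset of Y


def chi(subset):
    """Characteristic function of a subset, with values in Omega = {0, 1}."""
    return lambda y: 1 if y in subset else 0


def classified(char):
    """The subset classified by char: the preimage of 1 (pullback of t)."""
    return frozenset(y for y in Y if char(y) == 1)


# Going from a subset to its characteristic function and back is the identity.
recovers_subset = classified(chi(X)) == X

# All functions Y -> {0, 1}, encoded as tuples of values, and their subsets.
all_functions = list(product((0, 1), repeat=len(Y)))
all_subsets = {classified(lambda y, vals=vals: vals[y]) for vals in all_functions}
```

Since distinct functions classify distinct subsets, `all_subsets` has as many elements as `all_functions`, which is the finite shadow of (E.51).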




The haunting name “truth object” for $\Omega$ might explain some of the fascination
logicians and quantum physicists have felt for topos theory, which we now define:

Definition E.12. A topos is a cartesian closed category (i.e., having a terminal
object, binary products, and function spaces) with pullbacks and a subobject classifier.

More precisely, this defines an elementary topos. It follows from Proposition E.9
that a topos has all finite limits, and it can be shown that it also has all finite colimits.

It should be clear that Sets is a topos; indeed, in our presentation the presence of
the necessary ingredients of a topos within Sets partly motivated these ingredients.
More generally, all toposes relevant to this book are of the following sort. We first
note that for any two categories C and D we obtain a new category [C, D] whose
objects are (covariant) functors from C to D, and whose arrows are natural
transformations between such functors. It is often natural to consider contravariant functors,
giving the category $[C^{op}, D]$. If D = Sets, such functors are called presheaves on C.
The category $[C^{op}, \mathrm{Sets}]$ is often denoted by $\mathrm{Sets}^{C^{op}}$. An important special case is

$C = \mathcal{O}(X)$,    (E.53)

i.e. the topology of some space X (seen as a posetal category); with slight abuse of
notation, functors $F : \mathcal{O}(X)^{op} \to \mathrm{Sets}$ are called presheaves on X.

Theorem E.13. For any small category C, the category $[C^{op}, \mathrm{Sets}]$ is a topos.

Proof. We focus on the subobject classifier; the remainder follows from the fact
that limits in $[C^{op}, \mathrm{Sets}]$ (including pullbacks and function spaces) are computed
pointwise, i.e., if $D : J \to [C^{op}, \mathrm{Sets}]$ is a diagram, then for each $C \in C_0$ we obtain
a diagram $D_C$ in Sets defined by $D_C(j) = D(j)(C)$. Since Sets has all limits, we
obtain limits $\varprojlim D_C$ for each $D_C$. These form a single functor which is a limit of D.

The simplest example is the terminal object in $[C^{op}, \mathrm{Sets}]$, which comes out as
$1_0(C) = *$ for each $C \in C_0$, where $*$ is some arbitrary (but fixed) singleton.

To discuss the truth object in $[C^{op}, \mathrm{Sets}]$, we need a few definitions.

Definition E.14. 1. In any small category C, a sieve on an object $C \in C_0$ is a set S
of arrows with target C such that if $f \in S$, then $fh \in S$ whenever $fh$ is defined.

2. The maximal sieve $S^{(C)}_{\max}$ on C consists of all arrows with target C.

3. The pullback sieve $f^*S$ (on D) over an arrow $f : D \to C$ is the sieve

$f^*S = \{h : X \to D \mid fh \in S\}$.    (E.54)

4. We denote the set of all sieves on C by Sieves(C).
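
Sieves are easy to enumerate when C is a small poset seen as a category. The sketch below uses the chain 0 ≤ 1 ≤ 2 (an assumption made for illustration), encodes the unique arrow a → b for a ≤ b as the pair (a, b), and implements the maximal sieve and the pullback sieve (E.54). All function names are hypothetical.

```python
from itertools import combinations

OBJECTS = (0, 1, 2)
# One arrow (source, target) for every pair with source <= target.
ARROWS = [(a, b) for a in OBJECTS for b in OBJECTS if a <= b]


def compose(f, h):
    """The composite fh (first h, then f); needs target(h) == source(f)."""
    assert h[1] == f[0]
    return (h[0], f[1])


def is_sieve(S, C):
    """All arrows in S end at C, and S is closed under precomposition."""
    return all(f[1] == C for f in S) and all(
        compose(f, h) in S for f in S for h in ARROWS if h[1] == f[0])


def sieves(C):
    """Enumerate Sieves(C) by filtering all sets of arrows into C."""
    into_C = [f for f in ARROWS if f[1] == C]
    return [frozenset(c)
            for r in range(len(into_C) + 1)
            for c in combinations(into_C, r)
            if is_sieve(frozenset(c), C)]


def maximal_sieve(C):
    """All arrows with target C."""
    return frozenset(f for f in ARROWS if f[1] == C)


def pullback_sieve(f, S):
    """f*S = {h : X -> D | fh in S} for an arrow f : D -> C, as in (E.54)."""
    D = f[0]
    return frozenset(h for h in ARROWS if h[1] == D and compose(f, h) in S)


S = frozenset({(0, 2), (1, 2)})          # a sieve on 2 not containing id_2
pulled_back_along_member = pullback_sieve((1, 2), S)
pulled_back_along_nonmember = pullback_sieve((1, 2), frozenset({(0, 2)}))
```

In this example pulling S back along one of its own members yields the maximal sieve on the source, whereas pulling back a sieve along an arrow outside it yields a proper sieve.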

Clearly, if $\mathrm{id}_C \in S$, then $S = S^{(C)}_{\max}$. We will show that the truth object in $[C^{op}, \mathrm{Sets}]$ is

$\Omega_0(C) = \mathrm{Sieves}(C)$;    (E.55)

$\Omega_1(f) = f^*$.    (E.56)

The subobject classifier in $[C^{op}, \mathrm{Sets}]$, then, is the natural transformation

$t : 1 \to \Omega$;    (E.57)

$t_C : 1_0(C) \to \mathrm{Sieves}(C)$;    (E.58)

$t_C(*) = S^{(C)}_{\max}$.    (E.59)

To understand this, we need the Yoneda Lemma E.15 below. In preparation, for any
(fixed) $C \in C_0$, we define a functor $y_C : C^{op} \to \mathrm{Sets}$ by

$(y_C)_0(D) = \mathrm{Hom}_C(D,C)$;    (E.60)

$(y_C)_1(D \xrightarrow{f} D') = (g \mapsto gf)$,    (E.61)

the latter being a map from $\mathrm{Hom}_C(D',C)$ to $\mathrm{Hom}_C(D,C)$. This is often written as

$y_C = \mathrm{Hom}_C(-,C)$,    (E.62)

and the functors $y_C$ are called representable presheaves. Since $f : C \to C'$ induces
a natural transformation $y_C \to y_{C'}$ in the obvious way, i.e., its component $\tau_D$ at D is
the map $g \mapsto fg$ from $\mathrm{Hom}_C(D,C)$ to $\mathrm{Hom}_C(D,C')$, the map $C \mapsto y_C$ extends to a
functor $y : C \to [C^{op}, \mathrm{Sets}]$, called the Yoneda embedding.

Lemma E.15. For any $F \in [C^{op}, \mathrm{Sets}]$, any $C \in C_0$, and $x \in F_0(C)$, the map

$\tau^{(x)}_D : \mathrm{Hom}_C(D,C) \to F_0(D)$;    (E.63)

$(D \xrightarrow{f} C) \mapsto F_1(f)(x)$,    (E.64)

where $F_1(f)$ duly maps $F_0(C)$ to $F_0(D)$, forms the component at D of a natural
transformation $\tau^{(x)}$ from $y_C$ to F, and the ensuing map $x \mapsto \tau^{(x)}$ gives a bijection

$F_0(C) \cong \mathrm{Hom}_{[C^{op},\mathrm{Sets}]}(y_C, F)$.    (E.65)

Recall that by definition of a functor category, the right-hand side of (E.65) consists
precisely of the set $\mathrm{Nat}(y_C, F)$ of natural transformations from $y_C$ to F.
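
For a small poset the bijection (E.65) can be confirmed by enumeration. The sketch below takes C to be the chain 0 ≤ 1 ≤ 2 and a made-up presheaf F (both assumptions, as are all names in the code). Since Hom(D, C) is a singleton when D ≤ C and empty otherwise, a natural transformation from y_C to F amounts to one element of F_0(D) for each D ≤ C, subject to the naturality condition.

```python
from itertools import product

OBJECTS = (0, 1, 2)                     # the chain 0 <= 1 <= 2 as a category

# A hypothetical presheaf F: sets F_0(D) and the two generating restrictions.
F0 = {0: ("u", "v"), 1: ("a", "b", "c"), 2: ("x", "y")}
_R21 = {"x": "a", "y": "c"}             # F_1(1 <= 2) : F_0(2) -> F_0(1)
_R10 = {"a": "u", "b": "u", "c": "v"}   # F_1(0 <= 1) : F_0(1) -> F_0(0)


def F1(arrow):
    """Restriction F_1(D <= D') : F_0(D') -> F_0(D), returned as a dict."""
    src, tgt = arrow
    table = {e: e for e in F0[tgt]}
    for level, step in ((2, _R21), (1, _R10)):
        if src < level <= tgt:          # this generating step is on the way down
            table = {e: step[v] for e, v in table.items()}
    return table


def natural_transformations(C):
    """All natural transformations y_C -> F, as dicts D -> image of (D -> C)."""
    below = [D for D in OBJECTS if D <= C]
    found = []
    for choice in product(*(F0[D] for D in below)):
        tau = dict(zip(below, choice))
        # Naturality for every arrow D -> D' between objects below C.
        if all(tau[D] == F1((D, Dp))[tau[Dp]]
               for D in below for Dp in below if D <= Dp):
            found.append(tau)
    return found


def yoneda(C, x):
    """The transformation tau^(x) of (E.63)-(E.64)."""
    return {D: F1((D, C))[x] for D in OBJECTS if D <= C}


counts_match = all(
    len(natural_transformations(C)) == len(F0[C]) for C in OBJECTS)
every_one_is_yoneda = all(
    yoneda(C, x) in natural_transformations(C) for C in OBJECTS for x in F0[C])
```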

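Lemma E.16. For any $C \in C_0$ and $S \in \mathrm{Sieves}(C)$, the presheaf $X^{(S)}$ defined by

$X^{(S)}_0(D) = \mathrm{Hom}_C(D,C) \cap S$;    (E.66)

$X^{(S)}_1(D \xrightarrow{f} D') = (g \mapsto gf)$  $(g \in X^{(S)}_0(D'))$,    (E.67)

defines a subobject $m : X^{(S)} \to y_C$, and the ensuing map $S \mapsto X^{(S)}$ yields a bijection

$\mathrm{Sieves}(C) \cong \mathrm{Sub}(y_C)$.    (E.68)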

More generally, if X and Y (generalizing $X^{(S)}$ and $y_C$ above) are presheaves on C
with $X_0(D) \subseteq Y_0(D)$ for all D, then the equivalence class of X is a subobject of Y.
The proof of Lemma E.16 below uses the converse fact: any subobject of Y has a
representative X' for which X' is a subfunctor of Y, i.e., $X'_0(D) \subseteq Y_0(D)$ for all D,
and $X'_1$ is the restriction of $Y_1$, as in (E.70) below. To see this, suppose one has a
mono $m : X \to Y$, so that each component $m_D : X_0(D) \to Y_0(D)$ of m is an injective
function. We now define a presheaf X' on C by

$X'_0(D) = m_D(X_0(D)) \subseteq Y_0(D)$;    (E.69)

$X'_1(f) = Y_1(f)|_{X'_0(D')}$.    (E.70)

Furthermore, we define a natural transformation $m' : X' \to Y$, whose components
$m'_D : X'_0(D) \to Y_0(D)$ are given by set-theoretic inclusion. The natural transformation
$h : X \to X'$, defined through its components $h_D = \tilde{m}_D$ (where $\tilde{m}_D$ is $m_D$, but seen as
a map from $X_0(D)$ to $X'_0(D)$ rather than to $Y_0(D)$) then renders m and m' isomorphic.

Proof. The map $S \mapsto X^{(S)}$ has an inverse $X \mapsto S_X$, where $S_X \in \mathrm{Sieves}(C)$ is given by

$S_X = \bigcup_{D \in C_0} X_0(D)$. □

Combining (E.55) - (E.56) with Lemma E.15 applied to $F = \Omega$ gives

$\mathrm{Hom}_{[C^{op},\mathrm{Sets}]}(y_C, \Omega) \cong \mathrm{Sieves}(C)$,    (E.71)

so that Lemma E.16 yields a bijective correspondence between arrows from $y_C$ to
$\Omega$ as defined in (E.55) - (E.56), and subobjects of $y_C$. At D, diagram (E.50) is

$$\begin{array}{ccc}
\mathrm{Hom}_C(D,C) \cap S & \longrightarrow & * \\
\downarrow{\scriptstyle m_D} & & \downarrow{\scriptstyle t_D} \\
\mathrm{Hom}_C(D,C) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D)
\end{array}$$    (E.72)

where $m_D$ is the inclusion map, $t_D(*) = S^{(D)}_{\max}$, and $(\chi_m)_D(f) = f^*S$.
Commutativity of this diagram follows from the fact that if $f \in \mathrm{Hom}_C(D,C) \cap S$, then $f^*S$
is the maximal sieve on D, as trivially follows from the definition of a sieve. The
pullback condition is then easy to verify from Lemma E.16.

If we replace $y_C$ by any presheaf Y, the classifying map $\chi_m$ is given by

$(\chi_m)_D : Y_0(D) \to \mathrm{Sieves}(D)$;    (E.73)

$x \mapsto \{f : D' \to D \mid Y_1(f)(x) \in X_0(D')\}$,    (E.74)

noting that $Y_1(f)$ maps $Y_0(D)$ into $Y_0(D')$, and $X_0(D') \subseteq Y_0(D')$, since we assume that X
represents a subobject of Y such that $X_0(D) \subseteq Y_0(D)$. This generalizes the previous
case where $Y = y_C$. To show that $\chi_m$ is unique, we write down (E.50) at D:

$$\begin{array}{ccc}
X_0(D) & \longrightarrow & * \\
\downarrow & & \downarrow{\scriptstyle t_D} \\
Y_0(D) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D)
\end{array}$$    (E.75)

Also, the condition that $\chi_m$ be a natural transformation implies that the diagram

$$\begin{array}{ccc}
Y_0(D) & \xrightarrow{\ (\chi_m)_D\ } & \mathrm{Sieves}(D) \\
\downarrow{\scriptstyle Y_1(f)} & & \downarrow{\scriptstyle f^*} \\
Y_0(D') & \xrightarrow{\ (\chi_m)_{D'}\ } & \mathrm{Sieves}(D')
\end{array}$$    (E.76)

commutes for any $f : D' \to D$. Then (E.75) with $D \rightsquigarrow D'$ implies that for any
$y \in Y_0(D')$, we have $y \in X_0(D')$ iff $(\chi_m)_{D'}(y) = S^{(D')}_{\max}$. In particular, we may take
$y = Y_1(f)(x)$ for $x \in Y_0(D)$, so that $Y_1(f)(x) \in X_0(D')$ iff $(\chi_m)_{D'}(Y_1(f)(x)) = S^{(D')}_{\max}$.
Commutativity of the diagram (E.76) gives $(\chi_m)_{D'} \circ Y_1(f) = f^* \circ (\chi_m)_D$, so that
$Y_1(f)(x) \in X_0(D')$ iff $f^*((\chi_m)_D(x)) = S^{(D')}_{\max}$, which in turn is the case if and
only if $f \in (\chi_m)_D(x)$. Hence we finally obtain

$f \in (\chi_m)_D(x)$ iff $Y_1(f)(x) \in X_0(D')$,    (E.77)

which is the definition (E.73) - (E.74) of $\chi_m$, and renders it unique (given m).

Finally, the universal property of (E.50) follows from Proposition E.11: if X' plays
the role in (E.50) that P' plays in (E.44), then $m' : X' \to Y$ is the pullback of $\chi$ over t,
and hence m' must be equivalent to m. But we know (cf. Definition E.10.2) that an
equivalence between monos is unique. This closes the proof of Theorem E.13. □

Refining presheaves, we also introduce the category Sh(X) of sheaves on X,
which is the full subcategory of $[\mathcal{O}(X)^{op}, \mathrm{Sets}]$ defined by the following condition.

Definition E.17. A presheaf $F : \mathcal{O}(X)^{op} \to \mathrm{Sets}$ on X is a sheaf if for any open
$U \in \mathcal{O}(X)$, any open cover $U = \bigcup_j U_j$ of U, and any family $\{s_j \in F_0(U_j)\}$ such that

$F_1(U_{jk} \leq U_j)(s_j) = F_1(U_{jk} \leq U_k)(s_k)$,    (E.78)

for all j, k, there is a unique $s \in F_0(U)$ such that $s_j = F_1(U_j \leq U)(s)$ for all j.

Here $U_{jk} = U_j \cap U_k$, and $F_1(V \leq W) : F_0(W) \to F_0(V)$ is the arrow part of the functor
F. If F is a sheaf on X, then for each open $U = \bigcup_{j \in J} U_j$, it has the continuity property

$F_0(U) = \varprojlim_j F_0(U_j)$,    (E.79)

where the limit is defined with respect to the diagram $D : J \to \mathrm{Sets}$ where J is the
posetal category whose objects are $j \in J$, and $(i,j) \in J \times J$ provided $U_{ij} \neq \emptyset$, ordered
by $i \leq (i,j)$ and $j \leq (i,j)$, with $D(i) = F(U_i)$ and $D(i,j) = F(U_{ij})$, etc.

A key example of a sheaf on X is the sheaf of continuous functions, where

$F_0(U) = C(U, \mathbb{R})$.    (E.80)

If $U \leq V$, then the associated map $F_1(U \leq V) : C(V,\mathbb{R}) \to C(U,\mathbb{R})$ is simply
given by restriction. Sheaves may be defined far more generally (as done by
Grothendieck), namely on a site (which is a category equipped with a so-called
Grothendieck topology), but sheaves on a space are all we need in this book.

Analogously to Theorem E.13, Sh(X) is a topos, whose truth object is the sheaf

$\Omega_0(U) = \mathcal{O}(U)$;    (E.81)

$\Omega_1(U \leq V) = (-) \cap U$,    (E.82)

i.e., if $W \in \mathcal{O}(V)$, then $\Omega_1(U \leq V)(W) = W \cap U \in \mathcal{O}(U)$. With the terminal object in Sh(X)
being borrowed from $[\mathcal{O}(X)^{op}, \mathrm{Sets}]$, i.e., $1_0(U) = *$, its subobject classifier is

$t_U(*) = U$.    (E.83)

In fact, let X be a poset equipped with its intrinsic Alexandrov topology, whose
open sets are the upper sets, i.e. those $U \subseteq X$ for which $x \in U$ and $x \leq y$ implies
$y \in U$. Examples of opens are up-sets $U = {\uparrow}x = \{y \in X \mid x \leq y\}$, which form a basis
of the Alexandrov topology; in fact, ${\uparrow}x$ is the smallest open set containing x. For any
$x \in X$, we write Upper(x) for the set of all upper sets contained in ${\uparrow}x$.

Proposition E.18. If X is a poset, the category [X, Sets] of functors $\underline{F} : X \to \mathrm{Sets}$
(where X is seen as a category defined by the underlying poset) is isomorphic to the
category Sh(X) of sheaves on X (equipped with the Alexandrov topology), i.e.,

$[X, \mathrm{Sets}] \cong \mathrm{Sh}(X)$.    (E.84)

Note that [X, Sets] consists of presheaves on $X^{op}$ (in which $x \leq y$ iff $y \leq x$ in X).

Proof. This isomorphism is given by mapping a functor $\underline{F} : X \to \mathrm{Sets}$ to a sheaf
$F : \mathcal{O}(X)^{op} \to \mathrm{Sets}$, by defining the latter on a basis of the Alexandrov topology as

$F({\uparrow}x) = \underline{F}(x)$,    (E.85)

extended to general Alexandrov opens by (E.79). Vice versa, a sheaf F on X
immediately defines $\underline{F}$ by reading (E.85) from right to left. □

Corollary E.19. If X is a poset, the subobject classifier in [X, Sets] is given by

$\Omega_0(x) = \mathrm{Upper}(x)$;    (E.86)

$\Omega_1(x \leq y) = (-) \cap ({\uparrow}y)$;    (E.87)

$t_x(*) = {\uparrow}x$.    (E.88)

Proof. If C is a poset X, then a sieve on $x \in X$ is a lower subset of ${\downarrow}x$ (i.e., if $y \in S$
then $y \leq x$, and if also $z \leq y$, then $z \in S$). Recalling the comment after (E.84), the
claim then follows from (E.55) - (E.59). Alternatively, using Proposition E.18, the
claim also follows from (E.81) - (E.83). □

E.3 Subobjects and Heyting algebras in a topos

There are numerous connections between topos theory and intuitionistic logic, most
of which generalize links between set theory and classical logic. The beginning of
algebraic logic was Boole’s work, which in modern parlance structured the power
set of any set as a Boolean lattice, and hence provided a semantics for classical
propositional logic, cf. §D.2. From a categorical view, $\mathcal{P}(X)$ is the set Sub(X),
cf. (E.52) and subsequent text. This generalizes to any topos in which Sub(X) is a
set (rather than a proper class), except for the decisive difference that Sub(X) is no
longer a Boolean lattice but a Heyting algebra, making topos logic intuitionistic.

Proposition E.20. For any object X in a locally small topos T, the set Sub(X) of
subobjects of X is a Heyting algebra with respect to the partial ordering $\leq$ defined
by $[m : U \to X] \leq [m' : V \to X]$ iff there is $h : U \to V$ such that $m'h = m$.

It is easy to show from Definition E.10.1 that $\leq$ is well defined, and since it is
defined “on the nose”, i.e., at the level of representatives of the equivalence classes
in question, in what follows we will use monos rather than their equivalence classes.

Proof. Since we only need this result for presheaf toposes, we just list the pertinent
operations, and omit the verification of the details (which is left to the reader).

• The bottom element $\perp$ of Sub(X) is the unique arrow $\emptyset \to X$, where $\emptyset$ is the
initial object in T (any category with finite colimits has such an object, denoted
by $\emptyset$, whose defining property is that for any X there is a unique arrow $\emptyset \to X$; as
the notation suggests, in Sets the empty set is an initial object).

• The top element $\top$ of Sub(X) is the identity arrow $\mathrm{id}_X$ at X.

• The inf of $m : U \to X$ and $m' : V \to X$ is their pullback, i.e., abusing notation,

$$\begin{array}{ccc}
U \wedge V & \xrightarrow{\ p\ } & U \\
\downarrow{\scriptstyle q} & & \downarrow{\scriptstyle m} \\
V & \xrightarrow{\ m'\ } & X
\end{array}$$    (E.89)

so that the desired arrow $U \wedge V \to X$ is $mp = m'q$ (which is indeed a mono).

• The sup of $m : U \to X$ and $m' : V \to X$ is more complicated. In any topos T,
arrows have an epi-mono factorization $f = me$, where m is mono (and as such is
unique up to isomorphism), called the image of f, and e is epi. Furthermore, T
has finite colimits including coproducts. Reversing all arrows in (E.32) gives

$$\begin{array}{ccccc}
U & \longrightarrow & U \sqcup V & \longleftarrow & V \\
 & {}_{m}\searrow & \downarrow{\scriptstyle f} & \swarrow{}_{m'} & \\
 & & X & &
\end{array}$$    (E.90)

The sup $U \vee V$, then, is “the” image of the arrow f in this diagram.

• Finally, implication is defined in terms of an equalizer, which may be
constructed as a pullback, as follows: taking $Y = Z = X$ and $q_1 = q_2 = \mathrm{id}_X$ in
(E.32) gives a unique arrow $\Delta_X : X \to X \times X$, called the diagonal; in Sets it is
$\Delta_X(x) = (x,x)$. Furthermore, if we have two arrows $f, g : X \to Y$, taking $Z \rightsquigarrow X$,
$X \rightsquigarrow Y$, $q_1 \rightsquigarrow f$, and $q_2 \rightsquigarrow g$ in (E.32) gives a unique arrow $\langle f,g \rangle : X \to Y \times Y$,
which in Sets is of course given by $\langle f,g \rangle(x) = (f(x),g(x))$.

The equalizer of f and g, then, is the arrow $e : E \to X$ in the pullback

$$\begin{array}{ccc}
E & \xrightarrow{\ p\ } & Y \\
\downarrow{\scriptstyle e} & & \downarrow{\scriptstyle \Delta_Y} \\
X & \xrightarrow{\ \langle f,g \rangle\ } & Y \times Y.
\end{array}$$    (E.91)

The equalizer indeed deserves its name, because the map p equals both $fe$ and $ge$;
in Sets, $E \subseteq X$ may simply be taken to be the subset on which f and g coincide.

We return to our monos $m : U \to X$ and $m' : V \to X$, with inf $U \wedge V$: the mono
$(U \to V) \to X$ is the equalizer of the classifying maps $\chi_U, \chi_{U \wedge V} : X \to \Omega$. □

Recall that in Sets we may identify Sub(X) with the power set $\mathcal{P}(X)$, so that

$\perp = \emptyset$;    (E.92)

$\top = X$.    (E.93)

For $U, V \subseteq X$, the above constructions reduce to the well-known expressions

$U \leq V$ iff $U \subseteq V$;    (E.94)

$U \wedge V = U \cap V$;    (E.95)

$U \vee V = U \cup V$;    (E.96)

$U \to V = U^c \cup V$,    (E.97)

where for comparison below we may rewrite the right-hand side of (E.97) as

$U^c \cup V = \{x \in X \mid x \in U \Rightarrow x \in V\}$.    (E.98)

The (derived) expression (D.12) for negation then equates $\neg$ with complementation:

$\neg U = U^c = \{x \in X \mid x \notin U\}$.    (E.99)

In a presheaf topos $[C^{op}, \mathrm{Sets}]$, one obtains similar expressions for $\perp$ and $\top$, viz.

$\perp_0(C) = \emptyset$;    (E.100)

$\top_0(C) = X_0(C)$,    (E.101)

where the functor $\perp$ is the initial object in $[C^{op}, \mathrm{Sets}]$. The logical connectives
resemble the set-theoretic case, too, except for the last ones: if U and V are representatives
of subobjects of X such that $U_0(C) \subseteq X_0(C)$ and $V_0(C) \subseteq X_0(C)$, we have:

$U \leq V$ iff $U_0(C) \subseteq V_0(C)$ for all C;    (E.102)

$(U \wedge V)_0(C) = U_0(C) \cap V_0(C)$;    (E.103)

$(U \vee V)_0(C) = U_0(C) \cup V_0(C)$;    (E.104)

$(U \to V)_0(C) = \{x \in X_0(C) \mid \forall\, D \xrightarrow{f} C : X_1(f)(x) \in U_0(D) \Rightarrow X_1(f)(x) \in V_0(D)\}$;    (E.105)

$(\neg U)_0(C) = \{x \in X_0(C) \mid \forall\, D \xrightarrow{f} C : X_1(f)(x) \notin U_0(D)\}$.    (E.106)

This Heyting algebra is Boolean iff $\neg\neg U = U$ for each U, so we are interested in

$(\neg\neg U)_0(C) = \{x \in X_0(C) \mid \forall\, D \xrightarrow{f} C\ \exists\, E \xrightarrow{g} D : X_1(fg)(x) \in U_0(E)\}$.    (E.107)

It can be shown that Sub(X) is Boolean for each object X iff $\mathrm{Sub}(\Omega)$ is Boolean. In
order to settle this, we specialize (E.107) to subfunctors $m : U \to \Omega$, which gives

$(\neg\neg U)_0(C) = \{S \in \mathrm{Sieves}(C) \mid \forall\, D \xrightarrow{f} C\ \exists\, E \xrightarrow{g} D : (fg)^*S \in U_0(E)\}$.    (E.108)

For example, for the topos [X, Sets] of a poset X (cf. Corollary E.19), this expression becomes

$(\neg\neg U)_0(x) = \{S \in \mathrm{Upper}(x) \mid \forall\, y \geq x\ \exists\, z \geq y : S \cap ({\uparrow}z) \in U_0(z)\}$,    (E.109)

which is clearly an additional property of $S \in U_0(x)$; examples abound in Chapter
12. Thus the (propositional) logic of Sub(X) may be genuinely intuitionistic (and
given our examples, this conclusion especially applies to quantum logic).
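
The failure of ¬¬U = U can be seen in the smallest possible case. The sketch below does not work with Sub(Ω) as in (E.108) but with the subobjects of the terminal functor in [X, Sets] for the two-point chain 0 ≤ 1; these may be identified with the upper sets of X, which is an assumption of the example rather than something spelled out in the text. Formulas (E.106) and (E.107) then reduce to the quantifier patterns coded in `neg` and `double_neg`.

```python
POINTS = (0, 1)                 # the chain 0 <= 1, a hypothetical poset X


def up(x):
    """The up-set of x: all points above x (including x itself)."""
    return [y for y in POINTS if x <= y]


def neg(U):
    """(E.106) for upper sets: x lies in 'not U' iff no point above x is in U."""
    return frozenset(x for x in POINTS if all(y not in U for y in up(x)))


def double_neg(U):
    """(E.107) for upper sets: above every y >= x there is some z in U."""
    return frozenset(
        x for x in POINTS
        if all(any(z in U for z in up(y)) for y in up(x)))


U = frozenset({1})              # an upper set: the top point only
not_U = neg(U)
not_not_U = neg(not_U)
```

Here ¬U is empty, because the point 1 lies above every point, and hence ¬¬U is all of X, which strictly contains U. The closed formula `double_neg` returns the same set as applying `neg` twice, so already this two-point Heyting algebra fails to be Boolean.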

Although X is an object in a topos, Sub(X) is a Heyting algebra in ordinary set
theory. This is called an external description of X. Alternatively, one may study a
topos using so-called internal reasoning. We will develop the logical foundation of
internal reasoning (at least to some extent) in the next section, and for the moment
just look at a special example, namely Heyting algebras within some given topos.

Definition E.21. Let T be a topos (more generally, a category with all finite limits).

• A preorder on an object $X \in T_0$ in T is a mono $m_\leq : R \to X \times X$ for which:

1. The diagrammatic version of reflexivity (in set theory: $x \leq x$) holds, as
follows. The diagonal $\Delta_X \equiv \Delta : X \to X \times X$ factors through $m_\leq$, i.e. there is an
arrow $X \to R$ such that the following diagram commutes:

$$\begin{array}{ccc}
X & \longrightarrow & R \\
 & {}_{\Delta}\searrow & \downarrow{\scriptstyle m_\leq} \\
 & & X \times X
\end{array}$$    (E.110)

2. The diagrammatic version of transitivity (in set theory: $x \leq y$ and $y \leq z$ imply
$x \leq z$) holds, as follows. First, define P as the pullback

$$\begin{array}{ccc}
P & \xrightarrow{\ q\ } & R \\
\downarrow{\scriptstyle p} & & \downarrow{\scriptstyle p_1 \circ m_\leq} \\
R & \xrightarrow{\ p_2 \circ m_\leq\ } & X
\end{array}$$    (E.111)

where $p_1, p_2 : X \times X \to X$ are the arrows in (E.32), and p, q are defined as in
(E.44). The arrows $p_1 \circ m_\leq \circ p : P \to X$ and $p_2 \circ m_\leq \circ q : P \to X$, then, yield an
arrow $P \to X \times X$ via (E.32), which must factor through $m_\leq$, too.

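• A partial order on $X$ is a preorder that is antisymmetric (in set theory: $x \leq y$
and $y \leq x$ imply $x = y$), in the following sense. First, define the twist map

\[ \tau : X \times X \to X \times X \tag{E.112} \]

by taking $Z \rightsquigarrow X \times X$, $Y \rightsquigarrow X$, $q_1 \rightsquigarrow p_2$ and $q_2 \rightsquigarrow p_1$ in (E.32); in set theory, this
would be $\tau(x,y) = (y,x)$. This enables us to reverse the order by defining a monic

\[ m_\geq : R \hookrightarrow X \times X; \tag{E.113} \]
\[ m_\geq = \tau \circ m_\leq, \tag{E.114} \]

with associated pullback

[Diagram (E.115): pullback square with corner $P'$, projections $p', q' : P' \to R$, and the monos $m_\leq$ and $m_\geq$ out of $R$.]

The arrow $m_\leq \circ p' = m_\geq \circ q' : P' \to X \times X$, then, must factor through $\Delta : X \to X \times X$.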
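• A lattice in $\mathsf{T}$ is a partial order on some object $X$ for which there are arrows

\[ \wedge : X \times X \to X; \tag{E.116} \]
\[ \vee : X \times X \to X, \tag{E.117} \]

such that:

1. The arrow $m_\leq : R \hookrightarrow X \times X$ is an equalizer of the arrows $\wedge : X \times X \to X$ and
$p_1 : X \times X \to X$ (in set theory this expresses the property $x \leq y$ iff $x \wedge y = x$).

2. One has $\wedge \circ \Delta = \mathrm{id}_X$ and $\vee \circ \Delta = \mathrm{id}_X$ (i.e., $x \wedge x = x$ and $x \vee x = x$).

3. The following diagram (stating that $x \wedge (y \vee x) = (x \wedge y) \vee x = x$) commutes:

[Diagram (E.118): two squares sharing a middle arrow $c$ out of $X \times X$, with arrows $p_1$, $\mathrm{id}_X \times \vee$, $\wedge \times \mathrm{id}_X$, $\wedge$, and $\vee$, expressing $\wedge \circ (\mathrm{id}_X \times \vee) \circ c = p_1 = \vee \circ (\wedge \times \mathrm{id}_X) \circ c$.]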
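Here the middle arrow $c$ is the composition

\[ X \times X \to (X \times X) \times X \xrightarrow{\cong} X \times (X \times X) \xrightarrow{\cong} X \times X \times X. \tag{E.119} \]

• Let $1$ be 'the' terminal object in $\mathsf{T}$, with associated arrow $X \xrightarrow{\cong} X \times 1$ from (E.32),
with $Z \rightsquigarrow X$, $q_1 \rightsquigarrow \mathrm{id}_X$, and $Y \rightsquigarrow 1$. A top element in an internal lattice is an
arrow $\top : 1 \to X$ such that the following composite arrow is the identity $\mathrm{id}_X$:

\[ X \xrightarrow{\cong} X \times 1 \xrightarrow{\mathrm{id}_X \times \top} X \times X \xrightarrow{\wedge} X. \tag{E.120} \]

A bottom element is an arrow $\bot : 1 \to X$ for which the following arrow is $\mathrm{id}_X$:

\[ X \xrightarrow{\cong} X \times 1 \xrightarrow{\mathrm{id}_X \times \bot} X \times X \xrightarrow{\vee} X. \tag{E.121} \]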

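• A Heyting algebra in $\mathsf{T}$ is a lattice $X$ with $\top$ and $\bot$, endowed with an arrow

\[ \to \,: X \times X \to X, \tag{E.122} \]

such that the monos $m_1$ and $m_2$ in the double pullback diagram

[Diagram (E.123): double pullback with top row $P_1 \to R \leftarrow P_2$, vertical monos $m_1$ (out of $P_1$) and $m_2$ (out of $P_2$), and bottom row $X \times X \times X \to X \times X \leftarrow X \times X \times X$; the remaining arrow labels are not recoverable here.]

are equivalent (and hence define the same subobject of $X \times X \times X$).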

The reader may check that in Sets these definitions reduce to the usual ones; as one 
can clearly see, finding diagrammatic versions of familiar definitions is an art! 
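
As a hedged illustration of the claim that these definitions reduce to the usual ones in Sets, the following Python sketch checks the lattice conditions 1-3 of Definition E.21 on the three-element chain 0 < 1 < 2, together with the familiar set-theoretic Heyting condition $x \leq (y \to z)$ iff $x \wedge y \leq z$. The chain, the function names, and the reading of the double pullback condition as this adjunction are assumptions made for the example only.

```python
from itertools import product

# Assumed toy example: the chain 0 < 1 < 2 with its usual order.
X = [0, 1, 2]
TOP, BOTTOM = 2, 0

def meet(x, y):
    return min(x, y)

def join(x, y):
    return max(x, y)

def implies(y, z):
    # Relative pseudo-complement on a chain: top if y <= z, otherwise z.
    return TOP if y <= z else z

def check_lattice():
    for x, y in product(X, repeat=2):
        # Condition 1 (equalizer of meet and p_1): x <= y iff x ∧ y = x.
        assert (x <= y) == (meet(x, y) == x)
        # Condition 3 (absorption): x ∧ (y ∨ x) = (x ∧ y) ∨ x = x.
        assert meet(x, join(y, x)) == x == join(meet(x, y), x)
    for x in X:
        # Condition 2 (idempotence), plus the top and bottom composites.
        assert meet(x, x) == x and join(x, x) == x
        assert meet(x, TOP) == x and join(x, BOTTOM) == x
    return True

def check_heyting():
    # x <= (y -> z) iff x ∧ y <= z, for all triples.
    return all((x <= implies(y, z)) == (meet(x, y) <= z)
               for x, y, z in product(X, repeat=3))

def is_boolean():
    # Boolean iff double negation is the identity.
    neg = lambda x: implies(x, BOTTOM)
    return all(neg(neg(x)) == x for x in X)
```

On this chain the Heyting condition holds while `is_boolean()` fails at the middle element, so even in Sets a Heyting algebra need not be Boolean.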

The most important example of an internal Heyting algebra in a topos is $\Omega$.

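Theorem E.22. The truth object $\Omega$ in a topos $\mathsf{T}$ with subobject classifier $t : 1 \to \Omega$
is a Heyting algebra in the partial ordering $m_\leq$ on $\Omega$ (a mono into $\Omega \times \Omega$) defined as the
equalizer of the projection $p_1 : \Omega \times \Omega \to \Omega$ and the classifying map $\chi_{(t,t)} : \Omega \times \Omega \to \Omega$ of
the product arrow $(t,t) : 1 \to \Omega \times \Omega$ derived from $t : 1 \to \Omega$. In particular:

1. The inf arrow $\wedge : \Omega \times \Omega \to \Omega$ equals the classifying map $\chi_{(t,t)}$ of $(t,t)$.

2. The sup arrow $\vee : \Omega \times \Omega \to \Omega$ is the classifying map $\chi_\cup$ of the arrow (see below)

\[ (t, \mathrm{id}_\Omega) \cup (\mathrm{id}_\Omega, t) : (1 \times \Omega) \cup (\Omega \times 1) \to \Omega \times \Omega. \tag{E.124} \]

3. The implication arrow $\to \,: \Omega \times \Omega \to \Omega$ is the classifying map $\chi_{m_\leq}$ of $m_\leq$.

4. The top element $\top : 1 \to \Omega$ coincides with the subobject classifier $t : 1 \to \Omega$.

5. The bottom element $\bot : 1 \to \Omega$ is equal to the classifying map $\chi_0$ of $0 \to 1$.

6. The negation arrow $\neg : \Omega \to \Omega$ equals the classifying map $\chi_\bot$ of $\bot : 1 \to \Omega$.

For every object $Y \in \mathsf{T}_0$, this structure makes $\mathrm{Hom}_{\mathsf{T}}(Y, \Omega)$ an external Heyting
algebra (i.e., in Sets), such that (E.51) is an isomorphism of Heyting algebras.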
We omit the proof of Theorem E.22, which is a straightforward verification.

In no. 1, the arrow $(t,t)$ is a special case of the arrow $(f,g)$ defined just before
(E.91). In no. 2, we need the following construction, applied to the arrows

\[ (t, \mathrm{id}_\Omega) : (1 \times \Omega) \to (\Omega \times \Omega); \tag{E.125} \]
\[ (\mathrm{id}_\Omega, t) : (\Omega \times 1) \to (\Omega \times \Omega). \tag{E.126} \]

To define maps like the one in (E.124) in general, recall the coproduct diagram

[Diagram (E.127): coproduct diagram $X \to X + Y \leftarrow Y$, with arrows from $X$, $X + Y$, and $Y$ down to $Z$.]

which is just the opposite of the product diagram (E.32). In particular, for any given
monos $m_1 : X \hookrightarrow Z$ and $m_2 : Y \hookrightarrow Z$, we obtain a unique map

\[ (m_1, m_2) : X + Y \to Z. \tag{E.128} \]

The image of the latter in the sense of its epi-mono factorization $(m_1, m_2) = m \circ e$, i.e.

\[ (m_1, m_2) : X + Y \xrightarrow{e} X \cup Y \xrightarrow{m} Z, \tag{E.129} \]

is the mono denoted by $m_1 \cup m_2 : X \cup Y \hookrightarrow Z$ (which is called $m$ in the above diagram).

In no. 5, $0$ is the initial object in $\mathsf{T}$. Note that the truth arrow $t : 1 \to \Omega$ is the
same as the classifying map $\chi_{\mathrm{id}_1}$ of the identity arrow $\mathrm{id}_1 : 1 \to 1$, so that all arrows
in Theorem E.22 are classifying maps.

In the presheaf topos $[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$, where $\mathsf{C}$ is any category, products are taken
pointwise, and also, set-theoretic intersection commutes with pullback of sieves:

\[ f^*(S \cap S') = f^*(S) \cap f^*(S') \quad (f : D \to C;\ S, S' \in \mathrm{Sieves}(C)). \tag{E.130} \]

These facts imply that the component $\wedge_C$ of the natural transformation $\wedge$ is just

\[ \wedge_C(S, S') = S \cap S' \quad (S, S' \in \mathrm{Sieves}(C)), \tag{E.131} \]

which in turn implies that if $R$ is taken to be a subfunctor of $\Omega \times \Omega$, so that

\[ (m_\leq)_C : R_0(C) \hookrightarrow \mathrm{Sieves}(C) \times \mathrm{Sieves}(C) \tag{E.132} \]

is the inclusion map, we have $(S, S') \in R_0(C)$ iff $S \subseteq S'$. We also find

\[ \to_C(S, S') = \{f : D \to C \mid f^*S \subseteq f^*S'\}; \tag{E.133} \]
\[ \neg_C(S) = \{f : D \to C \mid f^*S = \emptyset\}, \tag{E.134} \]

which are easily checked to be natural in $C$. Finally, the top element $\top_C \in \mathrm{Sieves}(C)$
is the maximal sieve, and similarly the bottom element $\bot_C$ is the empty sieve.
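
The sieve operations just described can be tried out on a small poset. The sketch below is an illustration under stated assumptions, not part of the text: it takes the two-element chain 0 < 1 as the category (an arrow $D \to C$ exists iff $D \leq C$), represents a sieve on $C$ as a down-closed set of elements below $C$, and computes negation as implication into the empty sieve. All function names are hypothetical.

```python
from itertools import combinations

def down(elems, leq, c):
    """All d admitting an arrow d -> c, i.e. all d <= c."""
    return frozenset(d for d in elems if leq(d, c))

def sieves(elems, leq, c):
    """All sieves on c: down-closed subsets of down(c)."""
    below = sorted(down(elems, leq, c))
    found = []
    for r in range(len(below) + 1):
        for combo in combinations(below, r):
            s = frozenset(combo)
            if all(down(elems, leq, d) <= s for d in s):
                found.append(s)
    return found

def pullback(elems, leq, s, d):
    """f*S for the unique arrow f: d -> c of a poset."""
    return s & down(elems, leq, d)

def meet(s, t):
    """Componentwise meet of sieves is intersection."""
    return s & t

def implies(elems, leq, c, s, t):
    """All d <= c whose pullbacks satisfy f*S ⊆ f*T."""
    return frozenset(d for d in down(elems, leq, c)
                     if pullback(elems, leq, s, d) <= pullback(elems, leq, t, d))

def neg(elems, leq, c, s):
    """Negation as implication into the bottom (empty) sieve."""
    return implies(elems, leq, c, s, frozenset())

# Assumed example: the chain 0 < 1 and the sieve {0} on the stage 1.
ELEMS = [0, 1]
LEQ = lambda a, b: a <= b
S = frozenset({0})
```

Here `neg(ELEMS, LEQ, 1, S)` is the empty sieve, so the double negation of `S` is the maximal sieve rather than `S` itself; this is a minimal instance of the non-Boolean behaviour discussed around (E.107).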
E.4 Internal frames and locales in sheaf toposes

As we have seen in §D.1 as well as in §C.11, a complete Heyting algebra is the same
qua lattice structure as a frame, except that maps between frames are defined differently:
a frame map is required to preserve order and arbitrary suprema, whereas a
Heyting algebra map preserves order and implication. Furthermore, one has
locales, which are frames, too, except that maps go in the opposite direction. Hence
if $\mathsf{Frm}$ is the category of frames (within Sets), then the category $\mathsf{Loc}$ of locales is

\[ \mathsf{Loc} = \mathsf{Frm}^{\mathrm{op}}. \tag{E.135} \]

We also recall the bizarre (but wonted) notation $X$ for an object in $\mathsf{Loc}$ that is the
same as the object denoted by $\mathcal{O}(X)$ in $\mathsf{Frm}$, where nothing is implied about the
spatiality of the frames in question (i.e., it is not necessarily the case that there is
an actual space $X$ of which the given frame called $\mathcal{O}(X)$ is the topology). In the
same spirit, frame maps are written $f^* : \mathcal{O}(Y) \to \mathcal{O}(X)$ or $f^{-1} : \mathcal{O}(Y) \to \mathcal{O}(X)$, the
corresponding locale map being $f : X \to Y$ (which is the same map between the
same objects), once again, even if no space $X$ in the usual sense is around.

In any case, in order to define internal frames, locales, or complete Heyting algebras
in a topos, one must define completeness of internal lattices. This is difficult
diagrammatically, but it can be done through the internal language of §E.5, e.g. by

\[ \Vdash \forall_S \exists_x \big( (S \subseteq {\downarrow}x) \wedge \forall_y (S \subseteq {\downarrow}y \to x \leq y) \big), \tag{E.136} \]

where $S \subseteq X$ and $x, y \in X$ (technically, $S$ is a variable of type $\Omega^X$ and $x$ and $y$ are
variables of type $X$, see §E.5). We may avoid this, however, since due to the identification
(E.84) in Chapter 12 we can work in a sheaf topos $\mathrm{Sh}(X)$, where internal
frames have a simple external description, as follows: there is an equivalence

\[ \mathsf{Frm}_{\mathrm{Sh}(X)} \simeq (\mathsf{Frm}_{\mathsf{Sets}}/\mathcal{O}(X))^{\mathrm{op}} \tag{E.137} \]

between the category of internal frames in $\mathrm{Sh}(X)$ and the category of frame maps in
Sets with domain $\mathcal{O}(X)$, where the arrows between two such maps

\[ \pi_Y^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y); \tag{E.138} \]
\[ \pi_Z^{-1} : \mathcal{O}(X) \to \mathcal{O}(Z), \tag{E.139} \]

are the frame maps

\[ \varphi^{-1} : \mathcal{O}(Z) \to \mathcal{O}(Y) \tag{E.140} \]

that satisfy

\[ \varphi^{-1} \circ \pi_Z^{-1} = \pi_Y^{-1}. \tag{E.141} \]

This looks more palpable in terms of the "virtual" underlying spaces (i.e. locales):
if (E.138)-(E.140) are seen as inverse images of maps $\pi_Y : Y \to X$, $\pi_Z : Z \to X$,
and $\varphi : Y \to Z$, then the condition (E.141) corresponds to the equality $\pi_Z \circ \varphi = \pi_Y$.
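To explain the equivalence (E.137), we underline locales in $\mathrm{Sh}(X)$, writing $\underline{Y}$
etc.; the corresponding internal frame is denoted by $\mathcal{O}(\underline{Y})$ (which is the same object
in $\mathrm{Sh}(X)$ as $\underline{Y}$). The external description of $\underline{Y}$ in Sets, then, is a continuous map

\[ \pi : Y \to X, \tag{E.142} \]

where $Y$ is a locale in Sets (in which $X$ was a space to begin with), with frame

\[ \mathcal{O}(Y) = \mathcal{O}(\underline{Y})(X). \tag{E.143} \]

Also here, the notation $\pi : Y \to X$ is purely symbolic, and stands for a frame map

\[ \pi^{-1} : \mathcal{O}(X) \to \mathcal{O}(Y), \tag{E.144} \]

from which one may reconstruct $\underline{Y}$ as the sheaf

\[ \mathcal{O}(\underline{Y}) : U \mapsto \{V \in \mathcal{O}(Y) \mid V \leq \pi^{-1}(U)\} \quad (U \in \mathcal{O}(X)). \tag{E.145} \]

The frame maps (E.138)-(E.140) yield an internal frame map $\underline{\varphi}^{-1} : \mathcal{O}(\underline{Z}) \to \mathcal{O}(\underline{Y})$
in $\mathrm{Sh}(X)$, which is a natural transformation, by defining its components as

\[ \underline{\varphi}^{-1}(U) : {\downarrow}\pi_Z^{-1}(U) \to {\downarrow}\pi_Y^{-1}(U); \tag{E.146} \]
\[ S \mapsto \varphi^{-1}(S). \tag{E.147} \]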

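As an application, the Dedekind real numbers $\mathbb{R}$ can be axiomatized by what is
called a geometric propositional theory $\mathbb{T}$. In any topos $\mathsf{T}$ (with natural numbers
object), such a theory determines a certain frame $\mathcal{O}(\mathbb{T})_{\mathsf{T}}$ whose "points" are defined
as frame maps $\mathcal{O}(\mathbb{T}) \to \Omega$, where $\Omega$ is the subobject classifier in $\mathsf{T}$ (more precisely,
the object of points of $\mathcal{O}(\mathbb{T})$ in $\mathsf{T}$ is the subobject of $\Omega^{\mathcal{O}(\mathbb{T})}$ consisting of frame
maps). If $\mathbb{T}_{\mathbb{R}}$ is the theory axiomatizing $\mathbb{R}$, in Sets one simply has the frame

\[ \mathcal{O}(\mathbb{T}_{\mathbb{R}})_{\mathsf{Sets}} = \mathcal{O}(\mathbb{R}), \tag{E.148} \]

whose points are $\mathbb{R}$. More generally, if $\mathbb{T}$ is some geometric propositional theory, and
$X$ is a space with associated sheaf topos $\mathrm{Sh}(X)$, then the internal frame $\mathcal{O}(\mathbb{T})_{\mathrm{Sh}(X)}$
is given by the sheaf (E.145) defined by taking the frame map (E.144) to be the
inverse image map $\pi_1^{-1} : \mathcal{O}(X) \to \mathcal{O}(X \times \mathcal{O}(\mathbb{T})_{\mathsf{Sets}})$ of the projection
$\pi_1 : X \times \mathcal{O}(\mathbb{T})_{\mathsf{Sets}} \to X$ onto the first component. Using (E.148), this yields the frame of
Dedekind real numbers $\mathcal{O}(\mathbb{R}) \equiv \mathcal{O}(\mathbb{T}_{\mathbb{R}})$ in a sheaf topos $\mathrm{Sh}(X)$ as the sheaf

\[ \mathcal{O}(\mathbb{R})_{\mathrm{Sh}(X)} : U \mapsto \mathcal{O}(U \times \mathbb{R}). \tag{E.149} \]

The Dedekind real numbers object, on the other hand, is given by the sheaf

\[ (\mathbb{R})_{\mathrm{Sh}(X)} : U \mapsto C(U, \mathbb{R}). \tag{E.150} \]

Using (E.85), such results may immediately be transferred to $\mathsf{T}(A)$, see §12.1.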
E.5 Internal language of a topos

The internal language (also called Mitchell-Bénabou language) of a given topos
$\mathsf{T}$ looks like a first-order language, except that it is typed (i.e., many-sorted), in that
each term $\sigma$ has a certain type, written $\sigma : X$, indexed by the objects $X$ of $\mathsf{T}$. For
example, formulae (by definition) have type $\Omega$. In addition, symbols, terms, and
formulae have a list $FV(\sigma)$ of free variables. Furthermore, the internal language has
a canonical model in which it may be interpreted, whose carrier is $\mathsf{T}$ itself. We often
make no difference in notation between $\sigma$ as an element of the internal language of
$\mathsf{T}$ and its interpretation $[[\sigma]]$ in $\mathsf{T}$, which is some arrow in $\mathsf{T}$; the two are so closely
interwoven that making such a difference would be very artificial. Here are the rules.

• Constants $c$ of type $X$ correspond to arrows $c : 1 \to X$ (and so in Sets are elements
of $X$) and have no free variables, i.e. $FV(c) = \emptyset$. Here and in what follows, we
write 'corresponds to' in the following sense: for each arrow $c : 1 \to X$ there is a
constant $c$ of type $X$, and the interpretation of this constant is this arrow.

• Logically interesting constants are the subobject classifier $t : 1 \to \Omega$, which as in
Theorem E.22 we often write as $\top$, and its antipode $\bot : 1 \to \Omega$, defined as the
classifying map for the mono $0 \to 1$ (where $0$ is the initial object in $\mathsf{T}$).

• Variables $x$ of type $X$ correspond to the identity $\mathrm{id}_X : X \to X$, with $FV(x) = \{x\}$.

• Function symbols $f$ of type $Y$ correspond to arrows $f : X \to Y$. Thus in addition
to its type, $f$ has a source (namely $X$). Arities are unnecessary here; we may
take $X = Z \times \cdots \times Z$, with $n$ factors, and say that $f$ has source $Z$ with arity $n$, but
this is superfluous. Similarly, predicate symbols $P$ would be function symbols of
type $\Omega$ and hence they are redundant (clearly, even constants and variables are
special cases of function symbols, but it is useful to keep them apart).

• Terms are built by iteratively applying the following formation rules:

1. Constants and variables are terms of the given type.

2. If $\tau : X$ is a term of type $X$, and $f : X \to Y$ is a function symbol, then $f(\tau)$ is
a term of type $Y$, and $FV(f(\tau)) = FV(\tau)$. Furthermore, $[[f(\tau)]] = f \circ \tau \equiv f\tau$.

3. If we have $n$ terms $\tau_i : X_i$ ($i = 1, \ldots, n$), with $FV(\tau_1) = \cdots = FV(\tau_n) = F$, then
$(\tau_1, \ldots, \tau_n)$ is a term of type $X_1 \times \cdots \times X_n$ and $FV(\tau_1, \ldots, \tau_n) = F$.

If $\tau_i$ has interpretation $\tau_i : Y \to X_i$, then $(\tau_1, \ldots, \tau_n) : Y \to X_1 \times \cdots \times X_n$ is the
corresponding product arrow, as defined (for $n = 2$) before (E.91).

4. One may add free variables to terms; if $\tau : Z$ with interpretation $\tau : X \to Z$ has
a single free variable $x : X$, and we add a free variable $y : Y$, then the interpretation
of the revised term $\tau'$ with $FV(\tau') = \{x, y\}$ is $\tau' : X \times Y \xrightarrow{p_1} X \xrightarrow{\tau} Z$ (etc.).

5. From $\tau : X$ with $FV(\tau) = \{z_1, \ldots, z_n\}$ with $z_i : Z_i$, and $n$ terms $\sigma_i : Z_i$, all having
the same free variables $FV(\sigma_i) = \{y_1, \ldots, y_m\}$, with $y_j : Y_j$, we can form a new
term $\tau(\sigma_1, \ldots, \sigma_n)$ of type $X$ (i.e. the same type $\tau$ had), with free variables

\[ FV(\tau(\sigma_1, \ldots, \sigma_n)) = \{y_1, \ldots, y_m\}. \tag{E.151} \]

As the notation suggests, the interpretation of $\tau(\sigma_1, \ldots, \sigma_n)$ is $\tau \circ (\sigma_1, \ldots, \sigma_n)$.
• A formula is a term of type $\Omega$. A sentence is a formula without free variables,
which is therefore interpreted as an arrow $\varphi : 1 \to \Omega$. The rules for formulae are:

1. Let $\varphi$ be a formula with $FV(\varphi) = \{x, y\}$, with $x : X$ and $y : Y$. As in first-order
logic, we may write $\varphi$ as $\varphi(x,y)$. Then $\{x \mid \varphi(x,y)\}$ is a term of type $\Omega^X$, with

\[ FV(\{x \mid \varphi(x,y)\}) = \{y\}. \tag{E.152} \]

This rule implements the isomorphism (sometimes called $\lambda$-conversion)

\[ \mathrm{Hom}_{\mathsf{T}}(X \times Y, \Omega) \cong \mathrm{Hom}_{\mathsf{T}}(Y, \Omega^X), \tag{E.153} \]

which follows from the existence of exponentials in a topos. Indeed, (E.153)
turns the interpretation $\varphi : X \times Y \to \Omega$ into an arrow $\{x \mid \varphi(x,y)\} : Y \to \Omega^X$.
Similarly, from $\varphi : X \times Y \to \Omega$ we obtain a term $\{(x,y) \mid \varphi(x,y)\}$ of type
$X \times Y$, which is none other than the subobject classified by $\varphi$.

Taking $Y = 1$ to be the terminal object and using (E.51), we see that

\[ \mathrm{Hom}_{\mathsf{T}}(X, \Omega) \cong \mathrm{Sub}(X) \cong \mathrm{Hom}_{\mathsf{T}}(1, \Omega^X), \tag{E.154} \]

which shows that $\Omega^X$ plays the role the power set $\mathcal{P}(X)$ of $X$ plays in Sets.

2. If $\sigma : Y$ and $\tau : Y$ are terms with the same free variables, then $\sigma = \tau$ is a
formula having the same set of free variables as $\tau$ and $\sigma$. If $\sigma : X \to Y$ and
$\tau : X \to Y$, then the interpretation $[[\sigma = \tau]] : X \to \Omega$ is the composite arrow

\[ X \xrightarrow{(\sigma, \tau)} Y \times Y \xrightarrow{=_Y} \Omega, \tag{E.155} \]

where $=_Y$ is the classifying map of the diagonal $\Delta_Y : Y \to Y \times Y$.

3. If $\tau : Y$ and $\sigma : \Omega^Y$ are terms with the same free variables, then $\tau \in \sigma$ is a
formula with the same free variables. If $\tau : X \to Y$ and $\sigma : X \to \Omega^Y$, then

\[ [[\tau \in \sigma]] : X \xrightarrow{(\tau, \sigma)} Y \times \Omega^Y \xrightarrow{\mathrm{ev}} \Omega. \tag{E.156} \]

4. As in first-order (or propositional) logic, new formulae may be made from old
ones using the logical connectives $\wedge$, $\vee$, $\to$, and $\neg$. To interpret such composites,
it is convenient to assume that their components have the same free
variables, which can always be achieved using rule 4 for term-building above
(i.e., by adding free variables). So let $\varphi : X \to \Omega$ and $\psi : X \to \Omega$ be (interpretations
of) formulae, and let $\bullet$ be either $\wedge$, $\vee$, or $\to$. We then define

\[ [[\varphi \bullet \psi]] : X \xrightarrow{(\varphi, \psi)} \Omega \times \Omega \xrightarrow{\bullet} \Omega, \tag{E.157} \]

where the arrow $\bullet : \Omega \times \Omega \to \Omega$ is defined from the Heyting algebra structure
on $\Omega$ described in Theorem E.22. Similarly, negation is given by

\[ [[\neg\varphi]] : X \xrightarrow{\varphi} \Omega \xrightarrow{\neg} \Omega. \tag{E.158} \]
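5. If a formula $\varphi(x,y)$ contains $x$ freely, as well as other free variables collectively
called $y$, then $\exists_x \varphi(x,y)$ is a formula, whose interpretation we now give,
after a bit of preparation. First, consider the commutative diagram

[Diagram (E.159): two adjacent pullback squares with top row $P \to Z \to 1$, vertical arrows $f^*m$, $m$, and $t$, and bottom row $X \xrightarrow{f} Y \xrightarrow{\chi_m} \Omega$.]

where $m$ is a mono, so that its equivalence class defines an element of $\mathrm{Sub}(Y)$.
Taking the pullback of either $m$ and $f$, or, equivalently, of $t$ and $\chi_m \circ f$, we
obtain a monic $f^*m : P \hookrightarrow X$, whose equivalence class is an element of $\mathrm{Sub}(X)$.
Consequently, any arrow $f : X \to Y$ induces a map

\[ f^* : \mathrm{Sub}(Y) \to \mathrm{Sub}(X), \tag{E.160} \]

which is a homomorphism of (external) Heyting algebras (i.e. in Sets). For
example, in Sets, where $\mathrm{Sub}(X)$ may be identified with $\mathcal{P}(X)$ (see comment
after (E.52)), the map $f^* : \mathcal{P}(Y) \to \mathcal{P}(X)$ is simply the inverse image of
$f$. If we regard the lattices $\mathrm{Sub}(X)$ and $\mathrm{Sub}(Y)$ as posetal categories, the map
$f^*$ has both a left-adjoint and a right-adjoint, denoted by

\[ \exists_f : \mathrm{Sub}(X) \to \mathrm{Sub}(Y); \tag{E.161} \]
\[ \forall_f : \mathrm{Sub}(X) \to \mathrm{Sub}(Y). \tag{E.162} \]

To justify this suggestive notation, replace $X$ by $X \times Y$ and take $f : X \times Y \to Y$
to be $p_2$ (i.e., projection on the second space). Hence this gives maps

\[ \exists_{p_2} : \mathrm{Sub}(X \times Y) \to \mathrm{Sub}(Y); \tag{E.163} \]
\[ \forall_{p_2} : \mathrm{Sub}(X \times Y) \to \mathrm{Sub}(Y). \tag{E.164} \]

In Sets, we identify the Heyting algebras $\mathrm{Sub}(X \times Y)$ and $\mathrm{Sub}(Y)$ (now
Boolean) with $\mathcal{P}(X \times Y)$ and $\mathcal{P}(Y)$, respectively, and obtain (on $A \subseteq X \times Y$):

\[ \exists_{p_2}(A) = \{y \in Y \mid \exists_{x \in X} : (x,y) \in A\}; \tag{E.165} \]
\[ \forall_{p_2}(A) = \{y \in Y \mid \forall_{x \in X} : (x,y) \in A\}. \tag{E.166} \]

Returning to a general topos, given $\varphi : X \times Y \to \Omega$, the diagram

[Diagram: the subobject $\{(x,y) \mid \varphi(x,y)\} \hookrightarrow X \times Y$ classified by $\varphi : X \times Y \to \Omega$ is mapped along $p_2 : X \times Y \to Y$ to the subobject $\exists_{p_2}(\{(x,y) \mid \varphi(x,y)\}) \hookrightarrow Y$, whose classifying map $Y \to \Omega$ is $[[\exists_x \varphi(x,y)]]$.]

defines the interpretation $[[\exists_x \varphi(x,y)]]$ (with innocent abuse of notation in applying
the map $\exists_{p_2}$). The interpretation of $\forall_x \varphi(x,y)$ via $\forall_{p_2}$ is quite similar.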
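
In Sets, the two quantifier maps along the projection $p_2$ and their adjointness to the inverse image $p_2^*$ can be checked by brute force on small finite sets. The following Python sketch is illustrative only; the function names and the sample sets are assumptions, and subobjects are modelled as ordinary subsets.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def pullback_p2(B, X):
    """Inverse image of B ⊆ Y under the projection p_2: X × Y -> Y."""
    return frozenset((x, y) for x in X for y in B)

def exists_p2(A, X, Y):
    """Left adjoint: the y for which some x has (x, y) in A."""
    return frozenset(y for y in Y if any((x, y) in A for x in X))

def forall_p2(A, X, Y):
    """Right adjoint: the y for which every x has (x, y) in A."""
    return frozenset(y for y in Y if all((x, y) in A for x in X))

def adjunctions_hold(X, Y):
    """Check exists ⊣ p_2* ⊣ forall over all subsets A and B."""
    pairs = [(x, y) for x in X for y in Y]
    for A in powerset(pairs):
        for B in powerset(Y):
            left = (exists_p2(A, X, Y) <= B) == (A <= pullback_p2(B, X))
            right = (B <= forall_p2(A, X, Y)) == (pullback_p2(B, X) <= A)
            if not (left and right):
                return False
    return True

# Assumed sample data.
X = [0, 1]
Y = ["a", "b"]
A = frozenset({(0, "a"), (1, "a"), (0, "b")})
```

With this `A`, the existential image is all of `Y`, whereas the universal image contains only `"a"`, since `(1, "b")` is missing from `A`.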
We now define the (semantic) notion of truth for sentences in the internal language
of a topos; this is a far-reaching categorical generalization of the idea initially
studied in the straightforward context of propositional logic, cf. §D.2.

Definition E.23. 1. A sentence $\varphi$ in the internal language of $\mathsf{T}$ is true, written $\Vdash \varphi$,
if its interpretation $[[\varphi]]$ coincides with the subobject classifier $t : 1 \to \Omega$.

2. An open formula $\varphi(x)$ is true if its interpretation $[[\varphi(x)]] : X \to \Omega$ factors through
$t$, or, equivalently, if (the interpretation of) $\{x \mid \varphi(x)\}$, seen as the subobject of $X$
classified by $\varphi$ (as explained between (E.153) and (E.154)), is $X$ itself.

The two clauses of this definition are actually equivalent, since no. 1 is obviously a
special case of no. 2 by omitting the free variable $x$ (and hence taking $X = 1$), but
also, the second reduces to the first, because $\varphi(x)$ is true iff $\forall_x \varphi(x)$ is true.

As a refinement of this concept of truth, for $[[\varphi(x)]] : X \to \Omega$ as above, which we
simply write as $\varphi : X \to \Omega$, take an arrow $f : Y \to X$. By definition,

\[ Y \Vdash \varphi(f) \text{ means } \Vdash \varphi \circ f. \tag{E.167} \]

If $\varphi$ is a sentence (i.e. $X = 1$), this means that $Y \Vdash \varphi(f)$ iff $\varphi = \chi_Y$ (in other words,
$\varphi$ classifies $Y \to 1$). There are (at least) two applications of this idea:

• The notion of partial truth states that $\varphi$ is true at stage $Y$ if $Y \Vdash \varphi(f)$.

• We say that a set $\mathcal{G} \subseteq \mathsf{T}_0$ of objects generates $\mathsf{T}$ if for every pair of parallel
arrows $f : X \to Y$ and $h : X \to Y$, the property $fg = hg$ for all arrows $g : G \to X$
from all objects $G \in \mathcal{G}$ implies $f = h$. For example, any singleton generates Sets,
and for any category $\mathsf{C}$, it can be shown that the functors $y_C$ generate the presheaf
topos $[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$, as $C$ runs through $\mathsf{C}_0$. In general, it then follows that $\Vdash \varphi$ holds
iff $G \Vdash \varphi$ for all $G \in \mathcal{G}$ (and, implicitly, all arrows $G \to X$).

Both play a role in Chapter 12, in the case where $\mathsf{T} = [\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$ for a poset $\mathsf{C}$. By
the Yoneda Lemma E.15, any arrow $\alpha' : y_C \to X$ bijectively corresponds to some
element $\alpha \in X_0(C)$. In that case, we write $C \Vdash \varphi(\alpha)$ for $y_C \Vdash \varphi(\alpha')$, which by (E.167)
means that the arrow $\varphi \circ \alpha' : y_C \to \Omega$ factors through the subobject classifier $t : 1 \to \Omega$.

Kripke-Joyal semantics unfolds the expression $Y \Vdash \varphi(f)$ by looking at the formula
$\varphi$ in terms of its constituent terms. As one sees in Chapter 12, since this procedure
may be used iteratively, it is extremely useful for computational purposes.

Although more than we need (which is the posetal case), we now give the rules
for the validity of $C \Vdash \varphi(\alpha)$ in an arbitrary presheaf topos $[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]$, as just mentioned;
the posetal case follows in that $f : D \to C$ can only mean $D \leq C$.

We use the following notation:

• In clauses 1-4 below, we assume $\varphi : X \to \Omega$, and also $\psi : X \to \Omega$ (as already
noted, this can always be achieved by adding free variables to $\varphi$ and/or $\psi$).

• In 5-6, we assume $\varphi : X \times Y \to \Omega$ so as to accommodate the free variable $y : Y$.

• In 7 and 8, we have $\tau : X \to Y$, with $\sigma : X \to Y$ in no. 7, and $\sigma : X \to \Omega^Y$ in 8.

We then have the following forcing rules, which generalize the ones given at the 
end of §D.3, and should be seen as theorems of categorical logic and topos theory: 
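1. $C \Vdash \varphi(\alpha) \wedge \psi(\alpha)$ iff $C \Vdash \varphi(\alpha)$ and $C \Vdash \psi(\alpha)$.

2. $C \Vdash \varphi(\alpha) \vee \psi(\alpha)$ iff $C \Vdash \varphi(\alpha)$ or $C \Vdash \psi(\alpha)$.

3. $C \Vdash \varphi(\alpha) \to \psi(\alpha)$ iff $D \Vdash \varphi(\alpha f)$ implies $D \Vdash \psi(\alpha f)$ for each $f : D \to C$.

4. $C \Vdash \neg\varphi(\alpha)$ iff no arrow $f : D \to C$ exists such that $D \Vdash \varphi(\alpha f)$ holds.

5. $C \Vdash \exists_y \varphi(y)(\alpha)$ iff there exists $\beta \in Y_0(C)$ such that $C \Vdash \varphi(\alpha, \beta)$, where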

(E.168) 

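\[ \varphi(\alpha, \beta) = \varphi \circ (\alpha', \beta') \tag{E.169} \]

is obtained by combining the maps $\alpha' : y_C \to X$ and $\beta' : y_C \to Y$ into

\[ (\alpha', \beta') : y_C \to X \times Y. \tag{E.170} \]

If $\varphi$ has no free variables except $y$, then $C \Vdash \exists_y \varphi(y)$ iff there is $\beta \in Y_0(C)$ such
that $C \Vdash \varphi(\beta)$.

6. $C \Vdash \forall_y \varphi(y)(\alpha)$ iff $D \Vdash \varphi(\alpha f, \beta)$ for each $f : D \to C$ and each $\beta \in Y_0(D)$.

Here the arrow $f : D \to C$ induces a natural transformation $f' : y_D \to y_C$, yielding
$\alpha f = \alpha' \circ f' : y_D \to X$, which combines with $\beta' : y_D \to Y$ to

\[ (\alpha f, \beta) : y_D \to X \times Y. \tag{E.171} \]

Similarly to the previous case, if $\varphi$ has no free variables except $y$, we have

\[ C \Vdash \forall_y \varphi(y) \text{ iff } D \Vdash \varphi(\beta), \tag{E.172} \]

for each $f : D \to C$ and each $\beta \in Y_0(D)$.

7. $C \Vdash (\tau = \sigma)(\alpha)$ iff $\tau \circ \alpha' = \sigma \circ \alpha'$.

8. $C \Vdash (\tau \in \sigma)(\alpha)$ iff the arrow

\[ (\sigma \circ \alpha', \tau \circ \alpha') : y_C \to \Omega^Y \times Y \tag{E.173} \]

factors through the subobject of $\Omega^Y \times Y$ that is classified by the evaluation map
$\mathrm{ev} : \Omega^Y \times Y \to \Omega$. As a special case, take $Y = 1$ and hence $\tau : X \to 1$, so that

\[ \sigma : X \to \Omega^1 \cong \Omega \tag{E.174} \]

corresponds to a subobject $S \hookrightarrow X$ (i.e. classified by $\sigma = \chi_S$). The above subobject
of $\Omega \times 1 \cong \Omega$ is then simply given by the truth arrow $t : 1 \to \Omega$. Writing $x \in S$
for $\tau \in \sigma$ (where $x : X$ is a variable of type $X$), we therefore obtain the rule:

9. $C \Vdash (x \in S)(\alpha)$ iff $\sigma \circ \alpha' : y_C \to \Omega$ factors through $t$ (in other words, the subobject
of $y_C$ classified by $\sigma \circ \alpha'$ is $y_C$ itself).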
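
The propositional forcing rules 1-4 are easy to run on a finite poset. The sketch below is a simplified illustration, assuming the posetal case with sentences only (so $X = 1$ and the element $\alpha$ plays no role), with each atomic sentence given by a down-closed set of stages at which it is forced. The tuple encoding of formulae and all names are hypothetical.

```python
def forces(elems, leq, val, c, phi):
    """Decide whether stage c forces the sentence phi.

    phi is a nested tuple: ("atom", name), ("and", a, b), ("or", a, b),
    ("imp", a, b) or ("not", a). val maps atom names to down-closed sets
    of stages. An arrow d -> c is taken to exist iff leq(d, c).
    """
    kind = phi[0]
    if kind == "atom":
        return c in val[phi[1]]
    if kind == "and":   # rule 1
        return (forces(elems, leq, val, c, phi[1])
                and forces(elems, leq, val, c, phi[2]))
    if kind == "or":    # rule 2
        return (forces(elems, leq, val, c, phi[1])
                or forces(elems, leq, val, c, phi[2]))
    below = [d for d in elems if leq(d, c)]
    if kind == "imp":   # rule 3: quantify over all stages d <= c
        return all(not forces(elems, leq, val, d, phi[1])
                   or forces(elems, leq, val, d, phi[2]) for d in below)
    if kind == "not":   # rule 4: no stage d <= c forces the argument
        return not any(forces(elems, leq, val, d, phi[1]) for d in below)
    raise ValueError(f"unknown connective: {kind}")

# Assumed example: the chain 0 < 1, with p forced at stage 0 only.
ELEMS = [0, 1]
LEQ = lambda a, b: a <= b
VAL = {"p": {0}}
P = ("atom", "p")
EXCLUDED_MIDDLE = ("or", P, ("not", P))
DOUBLE_NEGATION = ("imp", ("not", ("not", P)), P)
```

At stage 1 neither `P` nor its negation is forced, so the law of the excluded middle fails there, while it holds at stage 0; the same example also refutes double negation elimination at stage 1.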
Notes

The standard introduction to category theory by one of its founders is Mac Lane 
(1998); see also the book by his student Awodey (2010), as well as the lecture notes 
by van Oosten (2002) and Cheng (2002). A nice book, which studies set theory 
from the point of view of category theory, is Lawvere & Rosebrugh (2003). At high- 
school level, see also Lawvere & Schanuel (1997) or (informally) Cheng (2015). 

Toposes were invented by Grothendieck in the early 1960s as part of his rebuild¬ 
ing of algebraic geometry; see Artin, Grothendieck, & Verdier (1972). The history 
and philosophy of category theory (including topos theory) has been described by 
Krömer (2007) and by Marquis (2009); for categorical logic see also Marquis &
Reyes (2012) and Bell (2005). According to a leading category & topos theorist: 

‘category theory was the objective form of dialectical materialism (...) set theory was con¬ 
sidered to be essentially bourgeois since it is founded on the relationship of belonging.’ 
(Marquis & Reyes, 2012, p. 30). 

Books on topos theory and categorical logic we used include (in increasing order 
of scope and sophistication): Goldblatt (1984), Bell (1988), Borceux (1994), Mac
Lane & Moerdijk (1992), and last but not least, the encyclopedic Johnstone (2002). 

§E.1. Basic definitions

von Neumann-Bernays-Gödel set theory is discussed in some detail in Mendelson
(2010); for algebraic set theory see Joyal & Moerdijk (1995). Category theorists
also typically rely on the notion of a Grothendieck Universe, see e.g. Mac Lane
(1998, §1.6), Marquis (2009, §5.5), and Krömer (2007, Ch. 6).

§E.2. Toposes and functor categories 

An axiomatization of Grothendieck’s toposes (and certain generalizations thereof) 
equivalent to Definition E.12 was given in 1970 by Lawvere and Tierney (it seems 
to have been customary among the pioneers of topos theory, who also include Joyal, 
not to publish their findings too lavishly and in fact no joint paper by Lawvere & 
Tierney recording their definition seems to exist at least in the open literature). 

§E.3. Subobjects and Heyting algebras in a topos 

See Mac Lane & Moerdijk (1992), §§1.8, IV.8, and Borceux (1994), §1.2. 

§E.4. Internal frames and locales in sheaf toposes 

The external description of internal locales in sheaf toposes originates with Joyal 
& Tierney (1984); see also Johnstone (2002), §C1.6.

§E.5. Internal language of a topos 

More details and proofs of the Kripke-Joyal semantics for the internal language 
of a topos may be found in Bell (1988), Ch. 4, Mac Lane & Moerdijk (1992), §IV.6, 
Borceux (1994), §6.6, and Johnstone (2002), §D1.2.

For an analysis of the notion of partial truth (as defined here) applied to quantum
mechanics (differently from our Chapter 12), see Butterfield (2002). 
completely additive on
119 

exchangeable, 306 
finitely additive on 119 

onS»(H),59, 119 
on locale, 489 
permutation-invariant, 306 
probability space, 104, 523 
non-commutative, 696 
problem of outcomes, 448 
problem of statistics, 446 
product, 811 
binary, 811 

Product Extension (assumption), 222 
projection, 499, 502, 573 
atomic, 601 
finite, 750 
minimal, 750 
projections, 125, 333 
proof by contradiction, 787 
proposition, 784 
propositional logic, 784 
pullback 
of a map, 90 


of arrows, 812 
pure state space, 31, 765 
normal, 125 

pure thermodynamics phase, 363 
purely logical symbols, 784, 793 
push-forward 

of a diffeomorphism , 90 
of a filter, 640 
pushout, 813 

Q (lost Gospel), 242 
quadratic form, 496 
Quantum Bayesianism (= QBism), 
436 

quantum De Finetti Theorem, 301 
quantum event, 40, 103 
quantum Ising chain, 348 
quantum Ising model, 348 
quantum logic, 459 
Birkhoff-von Neumann, 75-79, 
81,459 

intuitionistic, 471-475
quantum probability distribution, 40 
quantum random variable, 103 
quantum spin systems, 318 
quantum toposophy, 459 
quantum-mechanical law of strong 
numbers, 314 
quasi-linear, 121 
quasi-local observables, 318 
quasi-local sequence, 303 
quasi-state, 61, 490 
strong, 120 
weak, 120 

quasi-symmetric sequence, 300 

Radon-Nikodym Theorem, 549 
rather below (in lattice), 466 
reading scale, 447 
real numbers 

Dedekind, 461, 489 
lower, 489 
upper, 489 
real rank, 758 
reductio ad absurdum, 787 
regular Lie group action, 262 
regular polyhedra, 561
regular space, 83 
relation, 777 
relatively open, 262 
representation, 36 
admissible, 174, 176 
cyclic, 691 
induced, 263 
irreducible, 153, 693 
left-regular, 254 
nondegenerate, 695 
of C*-algebra, 691 
parafermionic, 285 
primary, 319 
skew-adjoint, 155, 188 
super-admissible, 174 
weakly contained, 734 
representations 
disjoint, 319 
equivalent, 164, 319 
equivalent admissible, 176 
quasi-equivalent, 319 
unitarily equivalent, 693 
resolution of the identity 
in a set with a transition 
probability, 32 
resolvent, 577, 581 
Riesz Lemma, 563 
Riesz Representation Theorem, 526 
Riesz-Frechet Theorem, 568 
right translation, 153 
right-annihilator, 760 
root, 166 
positive, 166 

O’(y, IT)-topology, 546 
σ-algebra, 523
σ-weak convergence, 512
σ-weak topology, 111, 512, 743
Sakai’s Theorem, 744 
Schatten-von Neumann ideals, 643 
Schmidt Extension (assumption), 
222 

Schrodinger equation, 15, 247, 445, 
446,449,515 


time-dependent, 186 
Schrodinger’s Cat, vii, 79, 439, 449, 
452, 453, 457 

Schrodinger, E., 1, 2, 248, 249, 252, 
439,441,451,452 
Schur duality, 277 
Schur’s Lemma, 153, 693 
Schwartz space, 178 
Scott topology, 485 
second cohomology group of G, 168 
self-adjoint operator 
maximal, 506 

self-adjoint operators, 125, 333 
self-adjointness (of quantization), 
295 

self-consistency equation, 414 
semantic entailment, 34 
semantic equivalence relation, 76 
semantics 
Kripke-Joyal, 831 
propositional logic, 784 
semi-direct product, 256 
regular, 268 

semi-direct product algebroid, 260 
semi-direct product groupoid, 259 
seminorm, 178 
seminorm (internal), 463 
semiring, 532 
fundamental lemma, 533 
sentence, 794 
in topos, 829 
separating duality, 546 
separation theorem, 544 
sequentially complete, 575 
sesquilinear form, 495 
bounded, 576 

set with a transition probability, 31 
set-theoretic universe, 802 
setting of experiment, 199 
sheaf, 818 

of continuous functions, 818 
Sheffer stroke, 787 
shift operator, 392 
sieve, 815 
maximal, 815 
pullback, 815
simplex, 28, 561 
Choquet, 560 
SNAG-Theorem, 724 
Sobolev Embedding Theorem, 182 
source map, 806 
space 

σ-compact, 527
as a groupoid, 259 
compact, 83 
Hausdorff (= T 2 ), 83 
hyperstonean, 748, 761 
locally compact, 83 
Polish, 641 
scattered, 761 
sober, 687 

Stone, 748, 761,780 
stonean, 748, 761 
totally disconnected, 780 
totally separated, 761 
spectral mapping property, 580, 659 
spectral order, 486 
spectral presheaf, 494 
spectral projection, 501, 588 
spectral radius, 578 
formula, 578 

spectral resolution, 500, 611 
in a set with a transition 
probability, 33 
spectral subspace, 501 
spectral theorem for self-adjoint 
operators 

approximation by projections, 592 
bounded measurable functional 
calculus, 591 

continuous functional calculus, 
590 

for compact operators, 612 
for unbounded operators, 633 
multiplication operator, 596, 598 
on finite-dimensional Hilbert 
space, 500 
spectral theory, 515 
spectrum, 500, 577, 581 
Arveson, 757 


Connes, 757 
continuous, 582, 641 
discrete, 582 
joint, 504 
point, 582 
residual, 641 

Spehner-Haake model, 453 
spin, 160, 175, 272 
spontaneous symmetry breaking, 
345, 367-433
double well, 371-378
mean-field theories, 409-415
quantum spin systems, 379-385 
state, 30 

/f-exchangeable, 301 
TT,-normal, 319 
clustering, 322 
coherent, 252, 371 
correlated, 243 
equilibrium, 345 
ergodic, 365 
Gibbs, 384 

ground, 345, 350, 353, 355 
infinitely exchangeable, 301 
KMS, 359 

local equilibrium, 356 
macroscopic, 324 
mixed, 31 
normal, 109 
on B(H), 43
on B(H)sa, 43
on C(X), 527
on C0(X), 84, 529
on C*-algebra, 646 
permutation-invariant, 301, 326 
primary, 319 
probability measure, 28 
product, 243 
pure, 31 
quasi-free, 403 
singular, 112 
trivial at infinity, 366 
uncorrelated, 243 
state space, 28, 30, 763 
normal, 112, 125 
normal pure, 333
normal total, 333
of B(H), 43
of B(H)sa, 43
of C*-algebra, 334, 647 
pure, 334, 647 
states 

^-distinguishable, 446 
Stone spectrum, 781 
Stone’s Representation Theorem, 
780 

Stone’s Theorem, 184 
Stone-Weierstrass Theorem, 555 
strictly convex (normed space), 543 
strong (operator) topology, 574, 742 
strong continuity of group action, 
344 

structure constants, 97 
subcategory, 807 
full, 807 
subfunctor, 817 
subobject, 814 
subobject classifier, 462, 814 
in [C°P, Sets], 816 
subrepresentation, 319 
sup-norm (= supremum norm), 83, 
522 
support 

of function, 522 
of measure, 557 
supremum, 778 
symmetric sequence, 299 
symmetrization operator, 298 
symmetry, 125 

algebraic quantum theory, 
333-366 
Bohr, 127, 334 
Jordan, 126, 334 
Kadison, 126, 334 
Ludwig, 127, 334 
permutation, 275-288 
property of metric, 516 
quantum mechanics, 125-191 
spatial translation, 346 
spontaneously broken, 379 


von Neumann, 127, 334 
weak Jordan, 126, 334 
weakly broken, 379 
Wigner, 126, 334
symmetry group, 345 
symplectic manifold, 89 
system of imprimitivity, 258 

tangent bundle (as Lie algebroid), 
260 

target map, 806 
tautological functor, 464 
tautology, 785 
tempered distribution, 178 
tensor category, 773 
tensor product 
algebraic, 697 
C*-norm, 700 
cross-norm, 700 
injective, 700 
maximal C*-norm, 701 
product state, 703 
projective, 701, 772 
spatial, 243 
state, 702 
term, 794 

term formation, 794 
terminal object, 461, 811 
in [C^op, Sets], 815
terms 

in topos, 828 
tertium non datur, 75 
theorem, 786, 795 
theorem of the highest weight, 166 
theory 

fundamental, 367 
higher-level, 367 
reduced, 367 
reducing, 367 
time-evolution, 345 
Tomita-Takesaki Theorem, 755 
Tomita-Takesaki theory, 754 
top element, 777, 820 
in internal lattice, 824 
topological vector space, 178, 541 
locally convex, 544
topos 

definition, 815 
elementary, 815 
topos theory 

and quantum logic, 459-494
introduction to, 805-833 
total set of states, 115 
trace, 508, 751 
finite, 751 
infinite, 751 
semifinite, 751 
transition probability, 31 
on P(B(H)), 47
on pure state space, 765 
triangle inequality 
metric, 516 
norm, 495 
true, 76, 475, 802 
truth 

at stages, 831 
in topos, 831 
partial, 831 
truth function, 785 
truth object, 461, 814 
in [C^op, Sets], 815
truth table, 785 

tubular neighbourhood theorem, 727 
twist map, 823 
two-point function, 217 

ultrafilter, 548, 781 
free, 548 
principal, 548 
ultraweak topology, 743 
unbounded multiplier, 681 
uncorrelated unit vector, 220 
uniform space, 765 
uniform structure, 765 
unilateral shift, 309 
unit, 463 

unital commutative C*-subalgebra, 125, 333
maximal, 506 
unitary dual, 164 


unitary gauge, 426 

Unitary Invariance (assumption), 222 
unitary operator, 125 
unitary representation, 151 
unitization, 660 
universal generalization, 795 
upper semicontinuous partition, 336 
upper set (= up-set), 819 
upward directed, 759 
Urysohn’s Lemma, 639 

valuation, 491, 784 
vanishing at infinity, 83 
variable, 793 
bound, 794 
free, 794 
in topos, 828 
variance, 25 
vector 

cyclic, 595, 691 
separating, 595 
vector bundle 

as Lie groupoid, 726 
vertex, 813 

von Neumann algebra, 2, 590, 742 
abelian, 747 
center, 318 
definition, 742 
factor, 747 
hyperfinite, 754 
injective, 754 
maximal commutative, 2 
standard form, 755 
von Neumann chain, 437 

wave mechanics, 1 

weak (operator) topology, 574, 742 

weak convergence in B(H), 574 

weak measurability, 152 

weak topology, 546 

weak* topology (= w*-topology), 546

weight, 164 
dominant, 165 
of a frame function, 65 
regular, 165

well inside (in lattice), 466 
well-formed formula, 784 
Weyl chamber, 165 
Weyl group, 164 
Weyl operator, 154 
Weyl quantization, 251 
Weyl’s Program, 259, 289 
Weyl, H., 18, 68, 172, 188, 251, 289, 290, 515, 583
Whitehead’s Lemma, 170 
Wigner cocycle, 265 
Wigner function, 251 
Wigner’s Theorem, 132, 147 


Wigner, E., 19, 187, 289, 290, 440, 442, 450, 457

would-be Goldstone boson, 424 

yes-no questions, 35 
Yoneda embedding, 816 
Yoneda Lemma, 816 
Young diagram, 277 
Young tableau, 277 
standard, 277 

Zariski topology, 690 
Zermelo-Fraenkel set theory, 793 
ZF-axioms, 798 